Abstract: We apply state-of-the-art data analysis methods
to a number of fictitious CMB mapping experiments, including
1/f noise, distilling the cosmological information from
time-ordered data to maps to power spectrum estimates, and 
find that in all cases, the resulting error bars can be well
approximated by simple and intuitive analytic expressions. 
Using these approximations, we discuss how to maximize the 
scientific return of CMB mapping experiments given the prac- 
tical constraints at hand, and our main conclusions are as 
follows. (1) For a given resolution and sensitivity, it is best 
to cover a sky area such that the signal-to-noise ratio per 
resolution element (pixel) is of order unity. (2) It is best to 
avoid excessively skinny observing regions, narrower than a 
few degrees. (3) The minimum-variance mapmaking method 
can reduce the effects of 1/f noise by a substantial factor, but
only if the scan pattern is thoroughly interconnected. (4) 1/f
noise produces a 1/ℓ contribution to the angular power spec-
trum for well connected single-beam scanning, as compared
to virtually white noise for a two-beam scan pattern such as 
that of the MAP satellite. 



I. INTRODUCTION 

Over the next decade, precision measurements of the 
cosmic microwave background (CMB) are likely to radi- 
cally tighten existing constraints on cosmological models. 
Although some upcoming experiments, e.g., the NASA 
MAP Satellite, already have their design and observing 
strategy essentially frozen in, many others do not, and 
face important tradeoffs between figures of merit such as 
resolution, sky coverage, frequency coverage and sensi- 
tivity. For instance, is it better to concentrate a given 
amount of observing time on a small patch, thereby im- 
proving the signal-to-noise per pixel, or to map a large 
area with lower accuracy? The purpose of this pa- 
per is to investigate how such tradeoffs affect the accu- 
racy with which cosmological models can be constrained, 
thereby providing some guidance for observers attempt- 
ing to maximize the scientific "bang for the buck" of their 
experiments. 



A. From maps to cosmology 

The approach taken with the first CMB experiments 
was to use numerical likelihood or Monte Carlo calcula- 
tions to assess the accuracy with which various parame-
ters could be measured from the data. It has gradually
become clear that although such calculations are useful 
post hoc, to compute accurate error bars once the ex- 
periment has taken place and the data set is in hand, 
simple and intuitive analytic approximations exist that 
are often accurate enough for studying the effects of de- 
sign tradeoffs. For instance, it was shown that the effect 
of incomplete sky coverage is well approximated by two 
simple effects: to increase the sample variance by a factor
$1/f_{\rm sky}$ [?], where $f_{\rm sky}$ is the fraction of the sky area
that is observed, and to smear out features in the power
spectrum on a scale $\Delta\ell \sim 1/\Delta\theta$, where $\Delta\theta$ is the size
of the patch (in radians) in its narrowest direction. In a
similar spirit, Knox showed that the effect of uniform
instrumental noise could be accurately modeled as simply
an additional random field on the sky, with an angular
power spectrum given by [?]

$$C_\ell^{\rm noise} = \frac{\Omega\sigma^2}{N}\,B_\ell^{-2} = \frac{4\pi f_{\rm sky}\,s^2}{t_{\rm obs}}\,B_\ell^{-2} = \frac{f_{\rm sky}}{w B_\ell^2}, \qquad (1)$$

and we give a detailed proof of this in Appendix A. Here
$\sigma$ is the r.m.s. noise in each of the $N$ pixels, the solid
angle covered is $\Omega = 4\pi f_{\rm sky}$, $s$ is the detector sensitivity
in units $\mu{\rm K}\,{\rm s}^{1/2}$, $t_{\rm obs}$ is the total observation time, and
$w$ is the raw sensitivity measure defined by [?]

$$w \equiv \frac{t_{\rm obs}}{4\pi s^2}. \qquad (2)$$

$B_\ell$ is the experimental beam function, which for a Gaussian
beam with standard deviation $\theta_b$ (see footnote 1) is well
approximated by

$$B_\ell \approx e^{-\theta_b^2\,\ell(\ell+1)/2}. \qquad (3)$$
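
As a quick numerical companion to the beam function, the following is a minimal Python sketch, not code from the paper. The function names are hypothetical, the 0.5 degree beam is an arbitrary illustrative value, and `noise_power` assumes the noise spectrum takes the form f_sky / (w B_ell^2), which is the form implied by the sky-coverage optimization later in the text.

```python
import math

def theta_b_from_fwhm(fwhm_rad):
    """Gaussian beam standard deviation from its FWHM (footnote 1)."""
    return fwhm_rad / math.sqrt(8.0 * math.log(2.0))

def beam(ell, theta_b):
    """Gaussian beam function B_ell of Eq. (3)."""
    return math.exp(-0.5 * theta_b**2 * ell * (ell + 1))

def noise_power(ell, f_sky, w, theta_b):
    """Noise power spectrum, ASSUMING the form C_noise = f_sky / (w B_ell^2)."""
    return f_sky / (w * beam(ell, theta_b) ** 2)

fwhm = math.radians(0.5)            # hypothetical 0.5 degree beam
theta_b = theta_b_from_fwhm(fwhm)
ell_b = 1.0 / theta_b               # multipole scale set by the beam
```

For this hypothetical beam, `ell_b` comes out near 270, and `noise_power` grows exponentially once `ell` exceeds it.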



Thus early estimates of how accurately cosmological parameters
could be measured based on Monte Carlo maps (e.g. [?]) could be
substantially accelerated. A further simplification was achieved
by altogether eliminating the likelihood minimization (performed
say by simulated annealing [?]), and computing the attainable
error bars directly from the power spectrum and its derivatives.
This procedure involves the formalism of the Fisher Information
Matrix (described in detail in [?]), and has now been used to
study the accuracy with which about a dozen cosmological
parameters can be simultaneously measured by MAP and the ESA
Planck mission [?], [10], going substantially beyond the obvious
conclusions that it helps to increase the sky and frequency
coverage, the resolution and the sensitivity. It was found that
by increasing the angular resolution to FWHM < 1°, thereby
measuring the power spectrum well beyond the first "Doppler
peak", much of the degeneracy between different parameters that
had been termed "cosmic confusion" [11] could be lifted, with
Planck measuring most parameters to within a few percent.
Measuring polarization as well was found to improve the accuracy
by a further factor of two, assuming that foreground and
systematics problems could be controlled [?]. Using the same
method, a number of experimental design issues for both
single-dish experiments and interferometers have been discussed,
with the attention limited to measuring the density parameter
$\Omega$ [10] ("weighing the Universe") and the observability of
a second Doppler peak [13].

* Hubble Fellow.

1 The FWHM (full-width-half-maximum) is given by
FWHM $= \sqrt{8\ln 2}\,\theta_b$.



B. From time-ordered data to maps 

All the above-mentioned results focused on the link 
between completed CMB maps and cosmological con- 
straints. In the presence of detector 1/f noise, how- 
ever, it is important to pay attention also to the previous 
step in the data-analysis pipeline, where the time-ordered 
data (TOD) is reduced to a map. Handy approximations 
for the impact of 1/f noise when circular scans are aver-
aged have been derived [14], and it is clear that the scan
strategy (by which we mean not merely how many times 
different pixels are observed, but also in what order) has 
a substantial impact on the attainable noise levels in the 
map. It has been argued [15] that it is desirable to have
a scan strategy that is as "connected" as possible, where 
each pixel is scanned through in many different direc- 
tions. 



C. New data-analysis techniques

Substantial progress has recently been made on the issue of how
to analyze a given data set. Computationally feasible methods
are now available for reducing data sets as large as those of
the upcoming satellite missions from time-ordered data to maps
and from maps to power spectra and cosmological parameter
constraints in a way that destroys no cosmological information,
in the sense that parameters can be measured just as accurately
as they could with a (computationally unfeasible) brute force
likelihood analysis of the entire time-ordered data set. A
recent clever implementation of the minimum-variance method for
reducing TOD to maps [15] is both feasible and lossless in this
sense [16] (all the cosmological information from the TOD is
distilled into the map with nothing leaking out of the
pipeline). A feasible and lossless power spectrum estimation
has also been found [?] for the case of Gaussian fluctuations.
The signal-to-noise eigenmode method (see [20]-[22], [?], [24]
and references therein) offers a feasible and lossless way of
constraining parameters directly from maps as long as the
number of pixels $n < 10^4$, as do the orthogonalized spherical
harmonic [25] and brute-force [26], [27] methods.



D. Outline

In this paper, we will adopt an approach to experimental design
which combines the accuracy of these new numerical methods with
the intuitive understanding of the analytical approximations.
This has essentially not been done before. For instance, the
published 1/f approximations [14] were not based on the lossless
mapmaking method [15], [16], but on a straight pixel averaging
which can be improved upon in many situations, and the resulting
angular power spectrum of the noise was not computed exactly
given 1/f noise, merely estimated with Monte Carlo simulations
[14]. Similarly, the above-mentioned sample variance
approximation was derived assuming a Gaussian autocorrelation
function, although as we will see, it is readily generalized to
a signal-to-noise or power spectrum analysis. We will present a
number of worked examples, using the above-mentioned lossless
data analysis methods, and show how in each case, these results
can be accurately matched by simple approximations. We then use
these approximations to arrive at rules of thumb for
experimental design. Section II discusses the effect of varying
four attributes of a map: its size, shape, sensitivity and
resolution. (For a discussion on the best choice of frequency
channels with regard to foreground removal, see e.g. [?].)
Section III discusses the preceding mapmaking step, and how two
attributes of the scan pattern (the 1/f noise level and the
amount of interconnectedness in the scan pattern) affect the
noise power spectrum in the map.


II. FROM MAP TO COSMOLOGY 

In this section, we analyze a number of different types 
of maps with the signal-to-noise eigenmode method and 
the lossless power spectrum method, focusing on the ef- 
fect of varying the map size, shape, sensitivity and reso- 
lution. 

A. Signal-to-noise eigenmodes: demystifying the
black box

The signal-to-noise (S/N) eigenmode method distills 
the information content of a CMB map into a set of mu- 
tually exclusive and collectively exhaustive chunks which 
have a number of properties that make them useful for 
measuring the CMB power spectrum and constraining 
cosmological models. Although it has traditionally been 
a "black box" method, where all the details are hidden in 
the numerical diagonalization of a large matrix, we will 
see below that the workings of this box are in fact easy 
to understand both qualitatively and quantitatively by 
making some simple approximations.²

B. A minimalistic review of the S/N method 

The S/N eigenmode method was introduced into CMB data analysis
by Bond [20] and Bunn [22], who both reinvented the method
independently. It is a special case of the Karhunen-Loève
method [23], and since our focus here is not on data analysis
methods but on experimental design, our review below is very
brief and the interested reader is referred to other recent
papers [?], [24] for method details. Suppose the CMB map is
pixelized into $N$ pixels whose center positions in the sky are
given by the unit vectors $\hat r_1, \hat r_2, \ldots, \hat r_N$.
The map consists of $N$ numbers $x_i = \bar x_i + n_i$, where
$\bar x_i = \delta T(\hat r_i)$ are the true sky temperatures and
$n_i$ are the instrumental noise contributions. We group these
numbers into $N$-dimensional vectors $\mathbf{x}$, $\bar{\mathbf{x}}$
and $\mathbf{n}$, respectively, so $\mathbf{x} = \bar{\mathbf{x}} + \mathbf{n}$.
The signal $\bar{\mathbf{x}}$ and noise $\mathbf{n}$ are assumed to
have zero mean ($\langle\bar{\mathbf{x}}\rangle = \langle\mathbf{n}\rangle = 0$),
to be uncorrelated ($\langle\bar{\mathbf{x}}\mathbf{n}^t\rangle = 0$),
and to have a multivariate Gaussian probability distribution
with covariance matrices $\mathbf{S} = \langle\bar{\mathbf{x}}\bar{\mathbf{x}}^t\rangle$
and $\mathbf{N} = \langle\mathbf{n}\mathbf{n}^t\rangle$. The data
covariance matrix is thus
$\mathbf{C} = \langle\mathbf{x}\mathbf{x}^t\rangle = \mathbf{S} + \mathbf{N}$.
The signal-to-noise eigenmodes are the $N$ vectors $\mathbf{b}_i$
satisfying the generalized eigenvalue equation

$$\mathbf{S}\mathbf{b}_i = \lambda_i \mathbf{N}\mathbf{b}_i. \qquad (4)$$

Grouping them together as the columns of the $N \times N$ matrix
$\mathbf{B}$, one computes a new data vector
$\mathbf{y} = \mathbf{B}^t\mathbf{x}$. These $N$ numbers $y_i$ are
the above-mentioned information chunks. They are mutually
exclusive in the sense that they are uncorrelated
($\langle y_i y_j\rangle = \mathbf{b}_i^t\mathbf{C}\mathbf{b}_j = [1 + \lambda_i]\delta_{ij}$)
and collectively exhaustive in the sense that they retain all
the information from the original data set (since $\mathbf{x}$
can be recovered by computing $(\mathbf{B}^t)^{-1}\mathbf{y}$, as
$\mathbf{B}$ is invertible). Moreover, the eigenvalues $\lambda_i$
can be interpreted as signal-to-noise ratios for the
coefficients $y_i$, and sorting them by decreasing
signal-to-noise, $y_1$ can be shown to contain the most
information about the power spectrum normalization, followed by
$y_2$, $y_3$, etc. Typically, the bulk of the coefficients are so
noisy that they can be thrown away without appreciable loss of
cosmological information, and such data compression has the
advantage of greatly accelerating subsequent analysis such as
likelihood computations, where the CPU time typically scales as
the cube of the size of the data set.

2 An interesting step in this direction was a handy
approximation for the special case of the MSAM experiment
chopping
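
To make the generalized eigenvalue problem of Eq. (4) concrete, here is a minimal pure-Python sketch under simplifying assumptions that are ours rather than the paper's: eight pixels spaced 1 degree apart along a great circle, white noise N = sigma^2 I (so the problem reduces to an ordinary eigenproblem for S), a flat fiducial spectrum, a 2 degree FWHM Gaussian beam, and plain power iteration for the leading mode only. All names and numerical values are hypothetical.

```python
import math

def legendre_all(lmax, x):
    """P_0..P_lmax at x via the standard three-term recurrence."""
    p = [1.0, x]
    for l in range(1, lmax):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[: lmax + 1]

def signal_cov(angles, lmax, theta_b, q):
    """S_jk = sum_l (2l+1)/(4 pi) C_l B_l^2 P_l(cos angle_jk), flat spectrum."""
    n = len(angles)
    cl = [0.0, 0.0] + [24 * math.pi / 5 * q**2 / (l * (l + 1))
                       for l in range(2, lmax + 1)]
    bl2 = [math.exp(-theta_b**2 * l * (l + 1)) for l in range(lmax + 1)]
    S = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j, n):
            p = legendre_all(lmax, math.cos(angles[j] - angles[k]))
            S[j][k] = S[k][j] = sum((2 * l + 1) / (4 * math.pi) * cl[l] * bl2[l] * p[l]
                                    for l in range(2, lmax + 1))
    return S

def top_mode(S, sigma, iters=500):
    """Leading S/N eigenmode for white noise N = sigma^2 I (power iteration)."""
    n = len(S)
    b = [1.0 + 0.01 * i for i in range(n)]
    for _ in range(iters):
        sb = [sum(S[i][j] * b[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in sb))
        b = [v / norm for v in sb]
    sb = [sum(S[i][j] * b[j] for j in range(n)) for i in range(n)]
    lam = sum(b[i] * sb[i] for i in range(n)) / sigma**2
    return lam, [v / sigma for v in b]   # scaled so that b^t N b = 1

angles = [math.radians(i) for i in range(8)]                 # pixel positions
theta_b = math.radians(2.0) / math.sqrt(8.0 * math.log(2.0))
S = signal_cov(angles, 300, theta_b, 30.0)                   # Q = 30 muK
lam, b = top_mode(S, 30.0)                                   # sigma = 30 muK
```

With this normalization the variance of the leading coefficient is 1 + `lam`, matching the expression quoted after Eq. (4). A full analysis would diagonalize the whole matrix with a linear algebra library instead of extracting one mode.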

The expectation value $\langle y_i^2\rangle$ generally equals a noise
term stemming from $\mathbf{N}$ plus a linear combination of
$\delta T_\ell^2 \equiv \ell(\ell+1)C_\ell/2\pi$, where the weights given
to the different $\delta T_\ell^2$ are denoted $W_\ell$, the window
function. Here $C_\ell$ is the customary angular power spectrum,
and the window functions are given by (e.g. [?])

$$W_\ell \propto \frac{2\ell+1}{\ell(\ell+1)}\,B_\ell^2 \sum_{j=1}^{N}\sum_{k=1}^{N} b_j\,b_k\,P_\ell(\hat r_j\cdot\hat r_k), \qquad (5)$$

where $P_\ell$ denotes Legendre polynomials. $W$ is normalized so
that $\sum_\ell W_\ell = 1$, so we can think of $y_i^2$ as measuring
a weighted average of the power spectrum coefficients
$\delta T_\ell^2$, with the window function giving the weights. As
the examples below will illustrate, the coefficients $y_i$
generally have the additional advantage of being fairly
localized in the Fourier (multipole) domain, by which is meant
that they have narrow window functions, and this makes them
useful for band power measurements.

C. Case study 1: round maps 

Let us first consider a CMB map with an angular resolution
$\theta_b$ covering the sky area within an angle $\theta$ from some
given point. For $\theta \ll 1$, this region will simply be a
(rather flat) disk of radius $\theta$, whereas $\theta = \pi$ gives a
full sky map. The sky fraction covered is

$$f_{\rm sky} = \sin^2\frac{\theta}{2}. \qquad (6)$$

We discretize the map into $N$ equal-area pixels which we
assume to have uncorrelated Gaussian noise with an r.m.s.
amplitude $\sigma$. To keep things simple, we use a flat fiducial
power spectrum $C_\ell \propto 1/\ell(\ell+1)$ with a
$Q = 30\,\mu{\rm K}$ quadrupole normalization, corresponding to
$\delta T_\ell = (12/5)^{1/2} Q \approx 47\,\mu{\rm K}$, a ball park
figure for recent degree-scale measurements.
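
The two closed-form relations in this setup are easy to check numerically. The sketch below is illustrative only, with hypothetical function names; it evaluates Eq. (6) and the flat-spectrum normalization quoted above.

```python
import math

def f_sky_cap(theta_rad):
    """Sky fraction within angle theta of a given point, Eq. (6)."""
    return math.sin(theta_rad / 2.0) ** 2

def delta_t_flat(q):
    """Flat band power delta T for a quadrupole normalization Q."""
    return math.sqrt(12.0 / 5.0) * q

def c_ell_flat(ell, q):
    """C_ell = 2 pi delta T^2 / (ell (ell + 1)) for the flat fiducial spectrum."""
    return 2.0 * math.pi * delta_t_flat(q) ** 2 / (ell * (ell + 1))

f_5deg = f_sky_cap(math.radians(5.0))   # the 5 degree cap used below
dt_30 = delta_t_flat(30.0)              # Q = 30 muK
```

For the 5 degree cap, `f_5deg` is about 0.0019, and `dt_30` is about 46.5 muK, which the text rounds to 47 muK.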

Fig. 1 shows the eigenmodes for the case $\theta = 5°$. Some
of the corresponding window functions are plotted in
Fig. 2, and the eigenvalues $\lambda_i$ are shown in Fig. 3. As
we will now describe, the contents of all of these figures
could have been approximately predicted without ever
carrying out the full numerical calculation.


1. Eigenmodes:

Let us first consider the eigenmodes. The exact choice of
pixelization is clearly irrelevant as long as the pixel
separation is much smaller than the beamwidth, so let us
simplify the problem by considering an infinitely fine
pixelization, where the eigenmodes are smooth functions
$b_i(\hat r)$. The spherical harmonics $Y_{\ell m}$ are
eigenfunctions of the Laplacian $\Delta$ with eigenvalues
$\ell(\ell+1)$, so multiplying by $\ell(\ell+1)$ in the Fourier
(multipole) domain is equivalent to applying the angular
Laplace operator on the sphere. When the fiducial power
spectrum is $C_\ell \propto 1/\ell(\ell+1)$, we can thus think of the
signal covariance matrix in Eq. (4) as essentially
$\mathbf{S} \propto \Delta^{-1}$. Since $\mathbf{N} \propto \mathbf{I}$,
the eigenmodes are thus basically eigenfunctions of the
Laplacian. When $\theta \ll 1$, sky curvature is negligible and
this reduces to the 2D Helmholtz equation. For the circular
case at hand, the solutions are well-known to correspond to
Bessel functions:

$$b_{\ell m}(\mathbf{r}) \propto J_m(k_\ell r)\,e^{im\varphi}, \qquad (7)$$

where $\mathbf{r} = (x, y) = r(\cos\varphi, \sin\varphi)$. Thus each mode
is specified by some integer $m$ and some radial wave number
$k_\ell$. This is verified by our numerical results. Since the
discrete eigenvectors of Eq. (4) are orthogonal when
$\mathbf{N} \propto \mathbf{I}$, the combination of $m$ and $k_\ell$
will be such that the functions $b_{\ell m}(\mathbf{r})$ are
orthogonal as well.



2. Window functions: 

Fig. 1 also shows that the larger mode numbers tend to
oscillate more. This reflects the fact that the window
functions probe increasingly small scales (large $\ell$) as the
mode number increases, which is more clearly illustrated in
Fig. 2. This is quite a general property of the CMB S/N method
[21], and holds because whereas the signal $C_\ell$ generally
decreases with $\ell$, the noise power $C_\ell^{\rm noise}$ stays
constant and eventually increases.³ Thus the mode with the
highest S/N-ratio will probe the largest scales to which the
map is sensitive, the runner up will be the largest scale mode
remaining (which is uncorrelated with the first one), etc.

The finite size of the survey places a crude lower limit on
the width of any window function [?]:

$$\Delta\ell \gtrsim 1/\Delta\theta, \qquad (8)$$

where $\Delta\theta$ is the angular extent of the survey in the
smallest direction, in our case $\Delta\theta \sim 2\theta$ (a more
careful discussion of this is given in Section II E 2). This
limit is typically attained with the decorrelated quadratic
method [27], whereas the S/N-method sometimes does
significantly worse. We find that the mean of a window
function, which we will denote $\langle\ell\rangle$ or
$\ell_{\rm eff}$ and define by

$$\ell_{\rm eff} \equiv \langle\ell\rangle \equiv \sum_\ell \ell\,W_\ell, \qquad (9)$$

is numerically well approximated by

$$\ell_{\rm eff} \approx \left(\frac{k}{f_{\rm sky}}\right)^{1/2} \qquad (10)$$

for mode number $k$. This can be understood as follows. If we
Fourier transform a finite patch of a homogeneous random field,
the Fourier coefficients become correlated over a coherence
volume in Fourier space whose size is roughly the inverse of
that of the patch; this is a well-known effect in the context
of galaxy surveys [30]. The situation is quite analogous with
spherical harmonics [?]: the number of multipole coefficients
that become correlated is roughly $1/f_{\rm sky}$. There are
$\ell_{\max}^2$ multipoles $Y_{\ell m}$ with $\ell < \ell_{\max}$, so
one expects to be able to form roughly
$f_{\rm sky}\ell_{\max}^2$ uncorrelated linear combinations of them.
Since the S/N-coefficients are all by construction
uncorrelated, one therefore expects there to be of order
$k = f_{\rm sky}\ell^2$ of them probing scales out to
$\ell_{\rm eff} \sim \ell$, in agreement with Eq. (10).

3 In Section III C, we will see that 1/f noise can in fact
produce a falling $C_\ell^{\rm noise}$. However, it generally falls
no faster than $\ell^{-1}$ whereas the CMB signal falls as
$\ell^{-2}$, so our conclusion remains unaffected.



3. The signal-to-noise eigenvalues

As mentioned, a S/N-coefficient measures a weighted average of
the power spectrum. As long as $\Delta\ell \ll \ell$ and the power
spectrum lacks sharp features, this average is well
approximated by the power at $\ell_{\rm eff}$, and we arrive at the
useful approximation

$$\lambda_k \approx \frac{C_{\ell_{\rm eff}}}{C^{\rm noise}_{\ell_{\rm eff}}}, \qquad (11)$$

where $\ell_{\rm eff}$ and $C_\ell^{\rm noise}$ are given by equations
(10) and (1), respectively. As shown in Fig. 3, this
approximation is generally quite accurate. Symmetries tend to
cause groups of modes to be degenerate, with identical
eigenvalues, causing horizontal lines to be visible for the
first modes. For the all-sky case, the $2\ell+1$ multipoles
corresponding to different $m$-values are degenerate, and for
azimuthally symmetric regions, the eigenvalues come in pairs
corresponding to a sine and a cosine mode. At the opposite end,
the very last modes are seen to contain even less signal than
predicted by Eq. (11). This is because the effect of discrete
pixelization becomes noticeable when the number of modes
approaches the number of pixels.

We have tested our approximation for maps of a garden variety
of shapes and sizes, and in all cases find an accuracy
comparable to that in Fig. 3. Because it is both accurate and
computationally trivial, it is a useful alternative to
full-blown simulations and S/N-calculations when studying
experimental design issues.
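
The mode-counting estimate and the eigenvalue approximation can be evaluated in a few lines. The following is a minimal sketch rather than the paper's code: it assumes the noise spectrum has the form f_sky / (w B_ell^2) with w = N / (4 pi sigma^2), uses the flat fiducial spectrum, and picks hypothetical values for the beam, pixel count and noise level.

```python
import math

def ell_eff(k, f_sky):
    """Approximate mean multipole of the k-th mode: sqrt(k / f_sky)."""
    return math.sqrt(k / f_sky)

def flat_c_ell(ell, q):
    """Flat fiducial spectrum C_ell = (24 pi / 5) Q^2 / (ell (ell + 1))."""
    return 24.0 * math.pi / 5.0 * q**2 / (ell * (ell + 1.0))

def noise_c_ell(ell, f_sky, w, theta_b):
    """ASSUMED noise form f_sky / (w B_ell^2) with a Gaussian beam."""
    return f_sky / w * math.exp(theta_b**2 * ell * (ell + 1.0))

def approx_eigenvalue(k, f_sky, w, theta_b, q):
    """Signal-to-noise ratio of mode k, evaluated at its effective multipole."""
    ell = ell_eff(k, f_sky)
    return flat_c_ell(ell, q) / noise_c_ell(ell, f_sky, w, theta_b)

f_sky = math.sin(math.radians(5.0) / 2.0) ** 2                 # 5 degree cap
theta_b = math.radians(0.5) / math.sqrt(8.0 * math.log(2.0))   # hypothetical beam
n_pix, sigma, q = 314, 30.0, 30.0                              # hypothetical values
w = n_pix / (4.0 * math.pi * sigma**2)
eigenvalues = [approx_eigenvalue(k, f_sky, w, theta_b, q)
               for k in range(1, n_pix + 1)]
```

The resulting list falls monotonically with mode number, because the signal decreases and the beam-deconvolved noise increases with multipole.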


D. Lesson 1: how to choose the map size

Above we found that the accuracy with which band powers could
be measured was accurately fit by the simple approximation
given by equations (11) and (10). Let us now use this result to
address the following question: given a fixed amount of
observing time, how large a sky area should one spread it over?
Is it better to scan as large an area as practically feasible,
or to map a smaller patch with a lower noise per pixel
(resolution element)?



1. How accurately can you measure band power?

Let $\bar C_\ell$ denote the power $C_\ell$ averaged over a multipole
band $\ell - L/2 < \ell' < \ell + L/2$, i.e., a band of width $L$
centered on $\ell$. How accurately can we measure the band power
$\bar C_\ell$? Eq. (11) showed that as long as $L > \Delta\ell$, each
S/N eigenmode whose window function fell into this band would
measure the power with a signal-to-noise ratio
$\lambda \approx \bar C_\ell/C_\ell^{\rm noise}$, which corresponds to
measuring the band power with an r.m.s. error
$\sqrt{2}\,(\bar C_\ell + C_\ell^{\rm noise})$ since the
S/N-coefficient has a Gaussian distribution. (These two terms
correspond to sample variance and noise variance,
respectively.) From our mode counting above, we know that there
are $\mathcal{N} \approx (2\ell+1) L f_{\rm sky}$ such eigenmodes
probing the multipole band, so since they are by construction
uncorrelated, the r.m.s. error simply drops by a factor
$\sqrt{\mathcal{N}}$ when we use all of them, giving

$$\Delta\bar C_\ell \approx \sqrt{\frac{2}{(2\ell+1) L f_{\rm sky}}}\,\left[\bar C_\ell + C_\ell^{\rm noise}\right]. \qquad (12)$$

This is to be compared with the equation

$$\Delta C_\ell \approx \sqrt{\frac{2}{(2\ell+1) f_{\rm sky}}}\,\left[C_\ell + C_\ell^{\rm noise}\right], \qquad (13)$$

which has frequently appeared in the literature and follows if
we naively set $L = 1$ in Eq. (12). This is of course not
legitimate when $f_{\rm sky} < 1$, since Eq. (12) is only valid
when $L \gg \Delta\ell$, which is just another way of saying that one
cannot measure an individual multipole $C_\ell$ alone when faced
with incomplete sky coverage. There is, nonetheless, a sense in
which Eq. (13) can be used, with the appropriate precautions:
as long as the power spectrum is smooth enough to be
featureless on the scale $\Delta\ell$, calculations assuming that we
can make uncorrelated measurements of the individual multipoles
with standard deviation given by Eq. (13) will always give the
right answer. For example, with this assumption, Eq. (13) can
be used to derive Eq. (12). Also, the Fisher information matrix
$\mathbf{F}$, which determines the accuracy with which
cosmological parameters $\theta_1, \theta_2, \ldots$ can be measured
[?], is correctly given by

$$\mathbf{F}_{ij} = \sum_\ell \frac{1}{(\Delta C_\ell)^2}\,\frac{\partial C_\ell}{\partial\theta_i}\,\frac{\partial C_\ell}{\partial\theta_j} \qquad (14)$$

in this case, where $\Delta C_\ell$ is given by Eq. (13).
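
The band-power error and the Fisher matrix expression lend themselves to a short numerical illustration. The sketch below is a hedged example with hypothetical function names; the Fisher function treats only the simplest case of a single parameter A that rescales the whole spectrum (an assumption made here for illustration), so that the derivative of C_ell with respect to A equals C_ell at A = 1.

```python
import math

def band_power_error(c_ell, c_noise, ell, band_width, f_sky):
    """Eq. (12): r.m.s. error on the band power; band_width = 1 gives Eq. (13)."""
    n_modes = (2 * ell + 1) * band_width * f_sky
    return math.sqrt(2.0 / n_modes) * (c_ell + c_noise)

def fisher_amplitude(c_ells, c_noises, ells, f_sky):
    """Eq. (14) for one parameter A with C_ell -> A C_ell (dC_ell/dA = C_ell)."""
    return sum((c / band_power_error(c, n, l, 1, f_sky)) ** 2
               for c, n, l in zip(c_ells, c_noises, ells))

# Noise-free toy case: each multipole contributes (2 ell + 1) f_sky / 2.
ells = list(range(2, 11))
c_ells = [1.0 / (l * (l + 1)) for l in ells]
fisher_value = fisher_amplitude(c_ells, [0.0] * len(ells), ells, 0.5)
```

In the noise-free toy case the attainable relative error on the amplitude is `1 / math.sqrt(fisher_value)`, which depends only on the number of independent modes.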



2. Maximizing the accuracy

Let us now vary $f_{\rm sky}$ to minimize the measurement error
on the band power $\bar C_\ell$. Substituting Eq. (1) into Eq. (12)
gives

$$\Delta\bar C_\ell \approx \sqrt{\frac{2}{(2\ell+1) L f_{\rm sky}}}\,\left[\bar C_\ell + \frac{f_{\rm sky}}{w B_\ell^2}\right]. \qquad (15)$$

Requiring the derivative of this with respect to $f_{\rm sky}$ to
vanish shows that the best choice of $f_{\rm sky}$ is

$$f_{\rm sky} = w B_\ell^2\,\bar C_\ell = \frac{N B_\ell^2\,\bar C_\ell}{4\pi\sigma^2}. \qquad (16)$$

Substituting this back into Eq. (1), we obtain
$C_\ell^{\rm noise} = \bar C_\ell$, so we see that this choice
corresponds to making the noise and sample variance
contributions equal.
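
The optimization can be verified by brute force. The following sketch is illustrative and rests on the same assumption as above, namely that the noise spectrum scales as f_sky / (w B_ell^2); the numerical values of the band power, w and the beam factor are arbitrary toy numbers.

```python
import math

def band_error(f_sky, c_ell, w, b_ell, ell=100, band_width=20):
    """Band-power error with the ASSUMED noise form f_sky / (w B_ell^2)."""
    c_noise = f_sky / (w * b_ell**2)
    return math.sqrt(2.0 / ((2 * ell + 1) * band_width * f_sky)) * (c_ell + c_noise)

def best_f_sky_numeric(c_ell, w, b_ell, steps=20000):
    """Brute-force scan over a logarithmic grid of sky fractions."""
    grid = [10 ** (-6 + 6 * i / steps) for i in range(steps + 1)]
    return min(grid, key=lambda f: band_error(f, c_ell, w, b_ell))

c_ell, w, b_ell = 2.0, 0.01, math.sqrt(0.5)     # arbitrary toy values
f_numeric = best_f_sky_numeric(c_ell, w, b_ell)
f_analytic = w * b_ell**2 * c_ell               # closed-form optimum
```

The grid search lands on the closed-form optimum to within the grid spacing, and at that point the noise term equals the signal term.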

This choice of $f_{\rm sky}$ depends on the multipole $\ell$ that
we are trying to measure, so which $\ell$ should we tailor the
experiment for? We argue that the natural choice is
$\ell \sim \ell_b \equiv 1/\theta_b$, the scale set by the beam
resolution, for the following reasons:

1. If one focuses on $\ell \ll \ell_b$, using the narrow beam is
like throwing pearls before swine, since one would
obtain about as good results even with inferior
angular resolution.

2. If one focuses on $\ell \gg \ell_b$, the beam factor $B_\ell^2$ will
be exponentially small and the resulting error bars
will be exponentially large.

To obtain a rule of thumb for choosing the map size, we will
therefore maximize the sensitivity to the scale
$\ell \sim \ell_b$, i.e., where the experiment has its strongest
comparative advantage over others. Since $B_{\ell_b} \sim 1$, this
gives simply $f_{\rm sky} \sim w C_{\ell_b}$. Let us translate this
into a more intuitive expression. In terms of a flat band power
$Q^2$, the power spectrum at $\ell = \ell_b$ is by definition

$$C_{\ell_b} = \frac{24\pi}{5}\,\frac{Q^2}{\ell_b(\ell_b+1)} \approx \frac{24\pi}{5}\,\theta_b^2 Q^2. \qquad (17)$$

If we divide the map area $4\pi f_{\rm sky}$ into $N$ pixels of
area FWHM$^2$, FWHM $= \sqrt{8\ln 2}\,\theta_b$, then equations (1)
and (17) together with our result $f_{\rm sky} \sim w C_{\ell_b}$
give

$$N = \frac{4\pi f_{\rm sky}}{{\rm FWHM}^2} \sim \frac{3\pi}{5\ln 2}\,\frac{Q^2\,t_{\rm obs}}{s^2}. \qquad (18)$$

In other words, for a given resolution and sensitivity, it is
best to cover a sky area of order the beam area times the
signal-to-noise factor $Q^2 t_{\rm obs}/s^2$. Again using Eq. (1),
this tells us that the noise per pixel should be of order

$$\sigma \sim 2Q. \qquad (19)$$
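
The pixel-count estimate can be turned into numbers as follows. This is a sketch under stated assumptions: the prefactor 3 pi / (5 ln 2) is our own evaluation of f_sky ~ w C_ell at ell_b = 1/theta_b with w = t_obs / (4 pi s^2) and ell_b (ell_b + 1) approximated by ell_b^2, and the detector sensitivity and observing time are hypothetical.

```python
import math

def optimal_pixel_count(q, s, t_obs):
    """N ~ [3 pi / (5 ln 2)] Q^2 t_obs / s^2 (prefactor derived under the
    assumptions stated in the lead-in, not quoted from the paper)."""
    return 3.0 * math.pi / (5.0 * math.log(2.0)) * q**2 * t_obs / s**2

def noise_per_pixel(n_pix, s, t_obs):
    """sigma = s sqrt(N / t_obs): each pixel gets t_obs / N of the time."""
    return s * math.sqrt(n_pix / t_obs)

q = 30.0        # muK
s = 500.0       # muK s^(1/2), hypothetical detector sensitivity
t_obs = 1.0e6   # seconds, hypothetical observing time
n_pix = optimal_pixel_count(q, s, t_obs)
sigma = noise_per_pixel(n_pix, s, t_obs)
```

With these inputs `sigma / q` is about 1.65 regardless of `s` and `t_obs`, in line with a noise per pixel of order 2Q.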

Since the r.m.s. CMB fluctuation $\sigma_{\rm cmb}$ in each pixel,
given by

$$\sigma_{\rm cmb} = \left[\sum_\ell \frac{2\ell+1}{4\pi}\,B_\ell^2\,C_\ell\right]^{1/2}, \qquad (20)$$

differs from $Q$ only by a logarithmic factor of order a few
for typical angular resolutions and cosmological models,
we arrive at the following useful rule of thumb:

• Choose the map size such that the signal-to-noise
ratio per pixel, $\sigma_{\rm cmb}/\sigma$, is of order unity.

Thus if an experiment has a noise level per resolution element
(pixel) of $\sigma \ll 50$-$100\,\mu{\rm K}$, it will be dominated
by sample variance, and better results can be obtained by
spreading the scan out over a larger sky patch. (Since
$f_{\rm sky}$ cannot exceed unity, it of course still makes sense
to aim for lower noise levels for full-sky experiments, such as
for instance the upcoming satellite missions.) This rule of
thumb agrees well with detailed calculations performed for the
specific case of the MSAM2 experiment [29].
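
The size of the logarithmic factor relating the per-pixel CMB fluctuation to Q can be estimated directly from Eq. (20). The sketch below assumes the flat fiducial spectrum and a Gaussian beam; the function name, the multipole cutoff and the 1 degree beam are illustrative choices, not values from the paper.

```python
import math

def sigma_cmb(q, fwhm_deg, lmax=5000):
    """Eq. (20) for C_ell = (24 pi / 5) Q^2 / (ell (ell + 1)), summed from ell = 2."""
    theta_b = math.radians(fwhm_deg) / math.sqrt(8.0 * math.log(2.0))
    total = 0.0
    for ell in range(2, lmax + 1):
        c_ell = 24.0 * math.pi / 5.0 * q**2 / (ell * (ell + 1))
        b2 = math.exp(-theta_b**2 * ell * (ell + 1))   # beam factor B_ell^2
        total += (2 * ell + 1) / (4.0 * math.pi) * b2 * c_ell
    return math.sqrt(total)

ratio_1deg = sigma_cmb(30.0, 1.0) / 30.0   # sigma_cmb / Q for a 1 degree beam
```

For a 1 degree beam the ratio is roughly 3, i.e. a per-pixel fluctuation of order 90 muK for Q = 30 muK, which is the scale against which the 50-100 muK noise level above should be read.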



E. Case study 2: oblong maps 

Above we used the fact that both noise and sample variance
depend only on the area of a map to determine the best choice
of map size. Changing the shape while keeping the area fixed
leaves the variance unchanged but affects the width of the
window functions, the spectral resolution. To study this in
more detail, we will now study the effect of elongating a map,
returning to a discussion of how to best choose the map shape
in the next section. Consider a small rectangular map of size
$\theta_x \times \theta_y$, where $\theta_y < \theta_x \ll 1$, so that we
can neglect the effect of sky curvature.



1. The eigenmodes 

As discussed above, we expect the signal-to-noise eigenmodes
to be eigenfunctions of the Laplacian, which for rectangular
symmetry take the form

$$b_{mn}(x, y) \propto \cos(k_x x + \alpha)\,\cos(k_y y + \beta), \qquad (21)$$

where the wave numbers $k_x$ and $k_y$ and the phases $\alpha$ and
$\beta$ are such that the modes are orthogonal. (We are using
coordinates where the map is centered on the north pole, so
$\hat r \approx (x, y, 1)$, $|x| < \theta_x$, $|y| < \theta_y$.) Eq. (21)
is verified by our numerical computations, and illustrated in
Fig. 4 and Fig. 5, where six sample modes are plotted together
with their window functions.



2. The window functions 

Fig. 4 shows that more oblong regions generally produce
inferior (wider) window functions, but Fig. 5 illustrates that
there is also a strong dependence on whether the oscillations
are mainly in the narrow or wide direction. All of this can be
readily understood by considering two-dimensional rather than
one-dimensional window functions, as illustrated in Fig. 6. In
the context of 3D galaxy redshift surveys, a mode probes a
weighted average of the power in three-dimensional Fourier
space, and it is well-known that this weight function (3D
window function) is simply the square modulus of the Fourier
transform of the mode itself. The situation is analogous in the
CMB case [12]: in the flat sky approximation, we can replace
the $(\ell, m)$ multipole space by a 2D Fourier space
$(k_x, k_y)$, and we can compute the 2D window function by simply
Fourier transforming the signal-to-noise eigenmodes of
Eq. (21), as illustrated in Fig. 6. The shape of the 2D window,
schematically illustrated by the ellipses, basically only
depends on the shape of the sky patch. It is of order
$(\Delta k_x, \Delta k_y) \sim (\theta_x^{-1}, \theta_y^{-1})$, so since
the 32° x 2° map is 16 times wider than it is high, the
ellipses corresponding to its window functions are drawn 16
times higher than wide. The central location of a window
function is determined by the wave vector $(k_x, k_y)$ in
Eq. (21). For instance, mode C in Fig. 6 has $k_y \approx 0$, i.e.,
virtually no vertical oscillations, so its window function lies
straight to the right of the origin (the careful reader will
notice that since the eigenmodes contain $\cos(k_x x)$ rather than
$\exp(i k_x x)$, there should be a mirror image to the left, but
this is omitted to avoid cluttering up the figure).
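
The statement that the window function is the square modulus of the Fourier transform of the mode, with a width of order the inverse patch size, can be illustrated in one dimension. The sketch below is a simplified analogue, not the paper's 2D calculation: it takes a mode cos(k0 x) restricted to |x| < a, uses the closed-form transform (a sum of two sinc-like terms), and measures the width of the peak near k0. The wave number and patch sizes are hypothetical.

```python
import math

def window_1d(k, k0, a):
    """|Fourier transform|^2 of cos(k0 x) restricted to |x| < a."""
    def sinc_term(d):
        return a if d == 0 else math.sin(d * a) / d
    return (sinc_term(k - k0) + sinc_term(k + k0)) ** 2

def half_max_width(k0, a, dk=0.01):
    """Full width at half maximum of the peak near k0 (scan in steps dk).

    Assumes k0 * a >> 1, so the mirror-image peak at -k0 barely contributes.
    """
    peak = window_1d(k0, k0, a)
    k = k0
    while window_1d(k, k0, a) > 0.5 * peak:
        k += dk
    return 2.0 * (k - k0)

w_narrow = half_max_width(1000.0, math.radians(1.0))    # 1 degree half-extent
w_wide = half_max_width(1000.0, math.radians(16.0))     # 16 degree half-extent
```

Making the patch 16 times longer shrinks the window width by very nearly a factor of 16, which is the one-dimensional version of the elongated ellipses described above.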

The 1D window functions plotted in Fig. 4 and Fig. 5 depend
only on $\ell$ (which corresponds to the radius $k$ in Fig. 6), not
on $m$ (roughly corresponding to the angular direction), and are
essentially the angular average of the 2D window functions. The
width of the 1D window functions is therefore determined by how
many of the concentric circles are crossed in Fig. 6:

• In Fig. 5, mode A has the worst window function,
because its longest extent is in the radial direction.
Its spectral resolution is thus determined by $\theta_y$, the
narrowest dimension of the patch.

• Mode C has the best window function, because its
shortest extent is in the radial direction. In the
limit $\ell_{\rm eff} \gg \Delta\ell$ (where it is very far from the
origin in Fourier space), its spectral resolution is thus
determined by $\theta_x$, the broadest dimension of the
patch.

• Modes like A and C constitute only a small minority,
with typical modes being more like B, with
comparable oscillations in the horizontal and vertical
directions. For very oblong patches, the window
function is a factor of $\sqrt{2}$ narrower for mode B than
for mode A, i.e., it is still determined by the
narrowest direction alone.

• The three cases compared in Fig. 4 are all "typical"
modes like B, but with patches whose aspect
ratios are 1, 4 and 16, respectively. These modes
are illustrated to the lower left in Fig. 6, and show
why the skinniest patch produces the worst result.



F. Lesson 2: how to choose the map shape 

Above we saw how the window functions resulting from 
oblong sky patches could readily be understood in a two- 
dimensional Fourier space picture. What does this tell 
us as regards the best choice of patch shape? 



1. What spectral resolution is needed? 

We want to be able to resolve all small-scale features in the
power spectrum that carry information about cosmological
parameters. What is this scale? Acoustic oscillations occur on
a scale set by the horizon size at last scattering,
corresponding to $\Delta\ell \sim 200$ [31] for an $\Omega = 1$ CDM
universe, and if $\Omega < 1$, this scale $\Delta\ell$ increases. To
accurately measure the power spectrum, we clearly need more
than one measurement per Doppler peak, but a spectral
resolution $\Delta\ell \sim 40$ would appear adequate for crude
measurements, and $\Delta\ell \sim 10$ should retain virtually all
cosmological information. The main exception is on the very
largest scales, where the late integrated Sachs-Wolfe effect
can cause variations on a scale $\Delta\ell \sim 1$ if the universe
has curvature or a non-zero cosmological constant. However,
this is rather irrelevant to our present discussion, which is
geared towards ground and balloon based experiments, since COBE
has already measured these low multipoles with high spectral
resolution, to near the cosmic-variance limit, and these
measurements are unlikely to be improved before the MAP mission
flies. Thus although some speculative models (e.g., [32])
introduce sharp features into the power spectrum, a spectral
resolution $\Delta\ell \sim 10$-$40$ appears sufficient for
constraining the parameters of both standard inflationary and
defect-based cosmologies.
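
Combining these target resolutions with the crude limit that the spectral resolution cannot be finer than the inverse of the patch extent gives the patch sizes involved. The helper below is a hypothetical illustration of that arithmetic and nothing more.

```python
import math

def min_patch_size_deg(delta_ell):
    """Smallest patch extent (degrees) compatible with a spectral
    resolution delta_ell, from delta_ell >~ 1 / delta_theta."""
    return math.degrees(1.0 / delta_ell)

size_for_40 = min_patch_size_deg(40.0)   # crude measurements
size_for_10 = min_patch_size_deg(10.0)   # nearly lossless
```

A resolution of 40 corresponds to a patch extent of about 1.4 degrees and a resolution of 10 to about 5.7 degrees, so the relevant patch sizes are a few degrees.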



2. A rule of thumb 

Above we saw that the signal-to-noise eigenmodes had
$\Delta\ell \sim 40$ for round regions of diameter
$\sim 5^\circ$. By eliminating ringing, the maximum-resolution
method can reduce this to $\Delta\ell \sim 10$ to $20$, but there
are fewer uncorrelated modes that are this narrow, so to avoid
increasing the sample variance, it is preferable to not go below
this map size. We also saw that the picture was more complex for
oblong regions. Although there are typically a small number of
modes (which used alone would give a large sample variance) with
narrow window functions like case C, the bulk of the modes have
their width determined by the narrowest direction of the survey.
It is easy to see that a skinny mode centered at
$(k_x, k_y) = k(\cos\varphi, \sin\varphi)$ will have a window
function width
$\Delta\ell \sim [\theta_x^{-2}\cos^2\varphi + \theta_y^{-2}\sin^2\varphi]^{1/2}$,
so if we use all the modes (which is necessary to attain the
sample variance from Section II D), then $(\Delta\ell)^2$ gets
averaged over $\varphi$ and we obtain

$$\Delta\ell \sim \left[\frac{\theta_x^{-2} + \theta_y^{-2}}{2}\right]^{1/2}. \qquad (22)$$



This means that if the patch is fairly oblong ($\theta_x \gg \theta_y$),
then $\theta_x$ becomes completely irrelevant to the resolution,
which is determined only by the narrowest direction, $\theta_y$.
Thus although it is possible to extract some useful con- 
straints from even narrower maps, we conclude our shape 
discussion with the following rule of thumb: 

• Avoid maps that are skinnier than a few degrees in 
the narrowest direction. 

We have seen that as long as the narrowest direction
> 5°, the situation is greatly simplified, since all modes
will be narrow enough in Fourier space to be cosmologi- 
cally useful. This means that one need not worry about 
weeding out the widest modes, and can attain the mini- 
mal sample variance that the map area permits. 
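
As a rough numerical illustration of the averaging that leads to Eq. (22), the sketch below averages $(\Delta\ell)^2 \sim \theta_x^{-2}\cos^2\varphi + \theta_y^{-2}\sin^2\varphi$ over mode orientations and compares the result with the closed form $[(\theta_x^{-2} + \theta_y^{-2})/2]^{1/2}$. The function names are hypothetical, and the order-unity prefactor hidden in the "~" is simply set to one (an assumption), so only ratios between patch shapes should be read off.

```python
import numpy as np

def delta_ell_closed_form(theta_x_deg, theta_y_deg):
    """Closed-form angle average; order-unity prefactor set to 1 (assumption)."""
    tx, ty = np.radians(theta_x_deg), np.radians(theta_y_deg)
    return np.sqrt(0.5 * (tx**-2 + ty**-2))

def delta_ell_numerical(theta_x_deg, theta_y_deg, n_angles=3600):
    """Average (Delta ell)^2 over mode orientations phi on a uniform grid."""
    tx, ty = np.radians(theta_x_deg), np.radians(theta_y_deg)
    phi = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    width_sq = tx**-2 * np.cos(phi)**2 + ty**-2 * np.sin(phi)**2
    return np.sqrt(width_sq.mean())

round_patch = delta_ell_closed_form(5.0, 5.0)     # 5 x 5 degree patch
oblong_patch = delta_ell_closed_form(50.0, 5.0)   # 50 x 5 degree patch
# Stretching the long side tenfold sharpens the resolution by at most sqrt(2).
ratio = round_patch / oblong_patch
```

In this toy setting the ratio stays just below $\sqrt{2}$ however long the patch is made, which is the sense in which the long side is irrelevant to the resolution.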



III. FROM SCAN PATTERN TO MAP 

The previous section discussed measuring the power 
spectrum and cosmological parameters from a map, fo-
cusing on how to best choose its size, shape, sensitivity
and resolution. In this section, we turn to the preceding 
step in the data analysis pipeline: reducing time-ordered 
data (TOD) to a map. Our focus will be on 1/f noise, 
and how the power spectrum of the TOD noise in the 
time-domain becomes processed into a map noise power 
spectrum in the multipole domain. Specifically, once the 
shape and size of the map have been decided as above, 
what is the best choice of scan pattern if we want to 
minimize the map noise? To what extent is it worth 
complicating the scan pattern to reduce the map noise? 



A. Mapmaking with 1/f-noise 

1. The mapmaking problem 

The CMB mapmaking problem (see [16] for a recent review) is to
estimate the map vector $\mathbf{x}$ of the previous section
from $M$ measured numbers $y_1, \ldots, y_M$, which we will
refer to as the time-ordered data (TOD), and group into an
$M$-dimensional vector $\mathbf{y}$. Assuming that the TOD
depends linearly on the map, we can write

$$\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{n} \qquad (23)$$

for some known matrix $\mathbf{A}$ and some random noise vector
$\mathbf{n}$. Without loss of generality, we can take the noise
vector to have zero mean, i.e., $\langle\mathbf{n}\rangle = 0$,
so the noise covariance matrix is

$$\mathbf{N} \equiv \langle\mathbf{n}\mathbf{n}^t\rangle. \qquad (24)$$



2. The solution 

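All linear methods can clearly be written in the form

$$\tilde{\mathbf{x}} = \mathbf{W}\mathbf{y}, \qquad (25)$$

where $\tilde{\mathbf{x}}$ denotes the estimate of the map
$\mathbf{x}$ and $\mathbf{W}$ is some $N \times M$ matrix that
specifies the method. If we make the choice

$$\mathbf{W} = [\mathbf{A}^t\mathbf{M}\mathbf{A}]^{-1}\mathbf{A}^t\mathbf{M}, \qquad (26)$$

where $\mathbf{M}$ is an arbitrary $M \times M$ matrix, then
$\mathbf{W}\mathbf{A} = \mathbf{I}$, which means that the
reconstruction error $\boldsymbol{\varepsilon}$, defined as

$$\boldsymbol{\varepsilon} \equiv \tilde{\mathbf{x}} - \mathbf{x} = [\mathbf{W}\mathbf{A} - \mathbf{I}]\mathbf{x} + \mathbf{W}\mathbf{n}, \qquad (27)$$

is independent of $\mathbf{x}$. In other words, the recovered
map $\tilde{\mathbf{x}}$ is simply the true map $\mathbf{x}$
plus some noise that is independent of the signal one is trying
to measure. We will therefore refer to $\boldsymbol{\varepsilon}$
as the noise map, and study how its statistical properties
depend on the scan strategy (specified by $\mathbf{A}$) and the
detector noise characteristics (given by $\mathbf{N}$).
Equations (25) and (27) show that its covariance matrix
$\boldsymbol{\Sigma} \equiv \langle\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^t\rangle$
is given by

$$\boldsymbol{\Sigma} = \mathbf{W}\mathbf{N}\mathbf{W}^t = [\mathbf{A}^t\mathbf{M}\mathbf{A}]^{-1}[\mathbf{A}^t\mathbf{M}\mathbf{N}\mathbf{M}\mathbf{A}][\mathbf{A}^t\mathbf{M}\mathbf{A}]^{-1} \qquad (28)$$

if the matrix $\mathbf{M}$ is symmetric.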

When choosing $\mathbf{M}$, it is clearly desirable to minimize
the diagonal elements of $\boldsymbol{\Sigma}$, the noise
variance in the map, which gives $\mathbf{M} = \mathbf{N}^{-1}$
and $\boldsymbol{\Sigma} = [\mathbf{A}^t\mathbf{M}\mathbf{A}]^{-1}$.
However, noise correlations manifested as off-diagonal elements
in $\boldsymbol{\Sigma}$ may also appear undesirable, and one
might fear that there is a tradeoff between these two evils that
muddles the issue as to how to choose $\mathbf{M}$. Fortunately,
this is not the case. $\mathbf{M} = \mathbf{N}^{-1}$ is the best
possible method in the sense that the map it produces can be
shown to retain all the cosmological information from the TOD,
even if the map is non-Gaussian, and it has also been shown to
be numerically feasible [15], so there is no need to settle for
anything less. If for instance Wiener-filtered or
Maximum-Entropy filtered maps are desired, these can always be
computed directly from $\tilde{\mathbf{x}}$ afterwards, without
recourse to the TOD.
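
The claims around Eqs. (26) to (28) can be checked on a toy problem. The following is a minimal sketch, not the authors' code: the pointing list, the problem size and the noise covariance $N_{ij} = 0.5^{|i-j|}$ are all assumptions chosen only so that the matrices are small enough to invert directly.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pix, n_obs = 4, 40

# Single-beam pointing matrix A: one "1" per row (toy scan, an assumption).
pointing = rng.integers(0, n_pix, size=n_obs)
pointing[:n_pix] = np.arange(n_pix)      # make sure every pixel is observed
A = np.zeros((n_obs, n_pix))
A[np.arange(n_obs), pointing] = 1.0

# Toy stationary noise covariance N_ij = 0.5^|i-j| (an assumption).
lags = np.abs(np.subtract.outer(np.arange(n_obs), np.arange(n_obs)))
N = 0.5 ** lags

def mapmaking_matrix(A, M):
    """W = [A^t M A]^{-1} A^t M, as in Eq. (26)."""
    AtM = A.T @ M
    return np.linalg.solve(AtM @ A, AtM)

def map_noise_covariance(W, N):
    """Sigma = W N W^t, as in Eq. (28)."""
    return W @ N @ W.T

W_opt = mapmaking_matrix(A, np.linalg.inv(N))   # minimum-variance choice M = N^{-1}
W_avg = mapmaking_matrix(A, np.eye(n_obs))      # plain pixel averaging, M = I
Sigma_opt = map_noise_covariance(W_opt, N)
Sigma_avg = map_noise_covariance(W_avg, N)
```

Both choices satisfy $\mathbf{W}\mathbf{A} = \mathbf{I}$, so neither biases the map; they differ only in the noise covariance, and the pixel variances of `Sigma_opt` never exceed those of `Sigma_avg`.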



3. Practical issues 

Although direct application of equations (25) and (28) using a
standard linear algebra package gives what we need
($\tilde{\mathbf{x}}$ and $\boldsymbol{\Sigma}$) in principle,
this would be too slow to be useful in practice, since
$\mathbf{N}$ is an $M \times M$ matrix and $M$ is typically
between $10^6$ and $10^{10}$. Fortunately, this can be remedied
by some numerical tricks. A useful way of implementing the
mapmaking algorithm described above was recently presented by
Wright [15]. It handles the inversion implicit in Eq. (25) by
solving for the vector $\tilde{\mathbf{x}}$ iteratively, with
the conjugate gradient method, never computing
$\boldsymbol{\Sigma}$, which means that one avoids inverting
large matrices explicitly. In the present paper, we specifically
need the map noise covariance matrix $\boldsymbol{\Sigma}$, to
compute the noise power. Below we present tricks enabling
explicit calculation of $\boldsymbol{\Sigma}$ for huge data sets
as long as the number of pixels $N < 10^4$ to $10^5$, which
should prove useful for some upcoming ground and balloon based
CMB experiments. The tricks make use of the fact that all three
of the huge matrices involved have very special properties:
$\mathbf{A}$ is extremely sparse, and $\mathbf{N}$ and
$\mathbf{M}$ can be replaced by matrices that are both
band-diagonal and circulant. Our treatment is slightly more
general than Wright's Fourier approach in that it treats
discreteness and edge effects exactly and is applicable also if
data blocks are too short to allow one to pre-whiten the noise
exactly.



4. The circulant matrix trick

As this and the subsequent section are rather technical, the
reader not interested in data analysis per se is encouraged to
jump directly to Section

A square matrix $\mathbf{C}$ is said to be circulant [33] if
each of its rows is merely the one above it cyclically shifted
one notch to the right, i.e., if $C_{i+1,j+1} = C_{ij}$,
understood (mod $M$) for an $M \times M$ matrix. As we will
return to below, circulant matrices have the useful property of
being extremely fast to invert and multiply.

Assuming that the statistical properties of the detector noise
are independent of time, the correlation between the noise
$n(t)$ at two different times will depend only on the time
separation: $\langle n(t)n(t')\rangle = c(t - t')$ for some time
correlation function $c$ (which is by definition symmetric;
$c(-\tau) = c(\tau)$). Assuming that the measurements in the TOD
are made at a uniform rate in time, separated by some constant
time interval $\Delta t$ and starting at some time $t_0$, the
noise covariance matrix $\mathbf{N}$ thus takes the form

$$N_{ij} = \langle n(t_0 + i\Delta t)\,n(t_0 + j\Delta t)\rangle = c(|i - j|\Delta t). \qquad (29)$$

For instance, the $M = 5$ case can be written

$$\mathbf{N} = \begin{pmatrix}
c_0 & c_1 & c_2 & c_3 & c_4 \\
c_1 & c_0 & c_1 & c_2 & c_3 \\
c_2 & c_1 & c_0 & c_1 & c_2 \\
c_3 & c_2 & c_1 & c_0 & c_1 \\
c_4 & c_3 & c_2 & c_1 & c_0
\end{pmatrix}, \qquad (30)$$

where we have defined the noise correlations

$$c_n \equiv c(n\Delta t). \qquad (31)$$



It would be numerically useful if this were a symmetric
circulant matrix. However, the above definition shows that the
$M = 5$ symmetric circulant matrix takes the form

$$\mathbf{N}_c = \begin{pmatrix}
c_0 & c_1 & c_2 & c_2 & c_1 \\
c_1 & c_0 & c_1 & c_2 & c_2 \\
c_2 & c_1 & c_0 & c_1 & c_2 \\
c_2 & c_2 & c_1 & c_0 & c_1 \\
c_1 & c_2 & c_2 & c_1 & c_0
\end{pmatrix}. \qquad (32)$$



In other words, the requirement that it wraps around modulo $M$
specifies the upper right and lower left corners, requiring that
$c_4 = c_1$ and $c_3 = c_2$, which would correspond to the
correlation between the first and last observation equaling that
between the first and second one, etc. However, it is useful to
decompose $\mathbf{N}$ as a sum of a circulant and a
non-circulant matrix, as

$$\mathbf{N} = \mathbf{N}_c + \mathbf{N}_s, \qquad (33)$$

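where for our $M = 5$ example, the latter is given by

$$\mathbf{N}_s = \begin{pmatrix}
0 & 0 & 0 & c_3 - c_2 & c_4 - c_1 \\
0 & 0 & 0 & 0 & c_3 - c_2 \\
0 & 0 & 0 & 0 & 0 \\
c_3 - c_2 & 0 & 0 & 0 & 0 \\
c_4 - c_1 & c_3 - c_2 & 0 & 0 & 0
\end{pmatrix}. \qquad (34)$$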



The subscript $s$ denotes sparse, since as we will see in the
next section, we can make the correlations $c_n$ vanish for
$n \gg 1$. If there is some integer $L \ll M$ such that
$c_n = 0$ for $n > L$, then $\mathbf{N}_s$ will contain merely
$L(L + 1)$ non-zero elements, and it will be trivial to multiply
by (which as we will see is all we need to do with it). The
circulant matrix $\mathbf{N}_c$ will be band-diagonal and
contain $(2L + 1)M$ nonzero numbers, i.e., a factor
$\sim 2M/L \gg 1$ more than $\mathbf{N}_s$. For typical
applications, $L \sim 10$ to $100$ and $M \sim 10^6$ to
$10^{10}$, so when performing matrix operations with
$\mathbf{N}$, the $\mathbf{N}_c$-term will completely dominate
over the $\mathbf{N}_s$-term. Specifically,
$\mathbf{N}^{-1} \approx \mathbf{N}_c^{-1}$.

We now come to our first speed trick. Our mapmaking algorithm
computes the correct $\boldsymbol{\Sigma}$ for the resulting map
$\tilde{\mathbf{x}}$ for any choice of $\mathbf{M}$. Minimizing
the map noise variance gave $\mathbf{M} = \mathbf{N}^{-1}$, so
this variance will clearly increase only to second order if we
change $\mathbf{M}$ slightly. Let us take advantage of this by
replacing the strictly optimal choice
$\mathbf{M} = \mathbf{N}^{-1}$ by the more convenient choice

$$\mathbf{M} = \mathbf{N}_c^{-1}. \qquad (35)$$
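
The decomposition of Eq. (33) is easy to reproduce for small matrices. The sketch below is illustrative only: the helper names and the numerical correlation values are made up, and dense matrices are built explicitly, which would of course be avoided for realistic $M$.

```python
import numpy as np

def lag_matrix(M):
    """|i - j| for every element of an M x M matrix."""
    i = np.arange(M)
    return np.abs(i[:, None] - i[None, :])

def toeplitz_noise(c, M):
    """N_ij = c_|i-j| (Eq. 29), with c_n = 0 beyond the lags supplied."""
    c_full = np.zeros(M)
    c_full[:len(c)] = c
    return c_full[lag_matrix(M)]

def circulant_part(c, M):
    """N_c: the same correlations, with lags wrapped modulo M (Eq. 32)."""
    c_full = np.zeros(M)
    c_full[:len(c)] = c
    lag = lag_matrix(M)
    return c_full[np.minimum(lag, M - lag)]

# The M = 5 example of Eqs. (30)-(34), with made-up correlation values.
c5 = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
Ns5 = toeplitz_noise(c5, 5) - circulant_part(c5, 5)

# A band-limited case: c_n = 0 for n > L, with L << M.
L, M = 3, 50
c_band = np.array([4.0, 2.0, 1.0, 0.5])        # c_0 ... c_L
Ns = toeplitz_noise(c_band, M) - circulant_part(c_band, M)
n_nonzero = np.count_nonzero(Ns)               # compare with L (L + 1)
```

In the band-limited case the only non-zero entries of `Ns` sit in the upper right and lower left corners, which is why the sparse term is cheap to handle.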



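To proceed, we need to be able to invert the $M \times M$ matrix
$\mathbf{N}_c$ to obtain $\mathbf{M}$. Being able to compute
$\mathbf{N}_c^{1/2}$ is also useful at times, since it enables
one to make Monte Carlo simulations of the noise using the
equation
$\mathbf{n} = \mathbf{N}^{1/2}\mathbf{z} \approx \mathbf{N}_c^{1/2}\mathbf{z}$,
where $\mathbf{z}$ is a vector of uncorrelated normalized
Gaussian random variables. We now describe how to do both. The
action of any function on a symmetric matrix is defined as the
corresponding real-valued function acting on its eigenvalues.
Since all symmetric matrices $\mathbf{C}$ can be diagonalized as

$$\mathbf{C} = \mathbf{R}\boldsymbol{\Lambda}\mathbf{R}^t, \qquad (36)$$

where $\mathbf{R}$ is orthogonal
($\mathbf{R}\mathbf{R}^t = \mathbf{I}$) and
$\boldsymbol{\Lambda} = \mathrm{diag}\{\lambda_i\}$ is diagonal
and real, one can extend any mapping $f$ on the real line to
symmetric matrices by defining

$$f(\mathbf{R}\,\mathrm{diag}\{\lambda_i\}\,\mathbf{R}^t) \equiv \mathbf{R}\,\mathrm{diag}\{f(\lambda_i)\}\,\mathbf{R}^t, \qquad (37)$$

or more explicitly,

$$f(\mathbf{C})_{mn} = \sum_k R_{mk}R_{nk}f(\lambda_k). \qquad (38)$$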



It is easy to see that this definition is consistent with 
power series expansions whenever the latter converge. 

Circulant matrices have the great advantage that they all
commute. This is because they can all be diagonalized by the
same matrix $\mathbf{R}$, an orthogonal version of the discrete
Fourier matrix. If $\mathbf{C}$ is symmetric, positive-definite,
circulant and infinite-dimensional (the latter is an excellent
approximation as long as $M \gg L$), then Eq. (38) simplifies to

$$f(\mathbf{C})_{mn} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f[\lambda(\varphi)]\cos[(m - n)\varphi]\,d\varphi, \qquad (39)$$

where $\lambda(\varphi)$, the spectral function of the matrix,
is the function whose Fourier coefficients are row zero of
$\mathbf{C}$,

$$\lambda(\varphi) \equiv \sum_{n=-\infty}^{\infty} C_{0n}\,e^{in\varphi}. \qquad (40)$$

Note that $f(\mathbf{C})$ is circulant as well. In particular,
the inverse $\mathbf{N}_c^{-1}$, which we can compute by
choosing $f(x) = 1/x$ in Eq. (39), will also be circulant. It is
easy to see that multiplying two circulant matrices also
produces a circulant matrix, and that this corresponds to
multiplying their spectral functions. This is equivalent to
convolving their zeroth rows, which is also extremely quick if
both matrices are band-diagonal. Thus all the operations on
circulant matrices in Eq. (28) (inverting $\mathbf{N}_c$ to
obtain $\mathbf{M}$, multiplying $\mathbf{M}$ with
$\mathbf{N}_c$, etc.), produce new circulant matrices, so all we
ever need to store is row zero of each square matrix being
manipulated.
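
For finite $M$, the same idea can be sketched with the discrete Fourier transform standing in for the integral of Eq. (39). This is an illustrative assumption rather than the procedure used in the paper: the eigenvalues of a symmetric circulant matrix are taken as the DFT of row zero, and all function names are hypothetical.

```python
import numpy as np

def circulant_from_row0(row0):
    """Dense circulant matrix whose rows are cyclic right-shifts of row0."""
    row0 = np.asarray(row0, dtype=float)
    M = len(row0)
    shift = (np.arange(M)[None, :] - np.arange(M)[:, None]) % M
    return row0[shift]

def spectral_values(row0):
    """Eigenvalues of a symmetric circulant matrix: the (real) DFT of row zero."""
    return np.fft.fft(row0).real

def function_of_circulant(row0, f):
    """Row zero of f(C) for a symmetric circulant C (finite-M analogue of Eq. 39)."""
    return np.fft.ifft(f(spectral_values(row0))).real

def multiply_circulants(row0_a, row0_b):
    """Row zero of the product: multiply spectra, i.e. cyclically convolve rows."""
    return np.fft.ifft(np.fft.fft(row0_a) * np.fft.fft(row0_b)).real

# Symmetric row zero: row0[k] == row0[-k]; values are made up.
row0 = np.array([4.0, 1.0, 0.5, 0.0, 0.0, 0.5, 1.0])
inv_row0 = function_of_circulant(row0, lambda lam: 1.0 / lam)
sqrt_row0 = function_of_circulant(row0, np.sqrt)
```

Only length-$M$ vectors are stored and transformed; the dense matrix built by `circulant_from_row0` is needed solely for checking small cases.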

What about the matrix $\mathbf{A}$? For a single-horned
experiment, all its entries are zero except that there is a
single "1" on each row. Letting $N_i$ denote the number of the
pixel pointed to at the $i$th observation (at time
$t = t_0 + i\Delta t$), we have $A_{ij} = 1$ if $N_i = j$,
$A_{ij} = 0$ otherwise. This makes it very simple to multiply by
both $\mathbf{A}$ and $\mathbf{A}^t$. For instance, for any
vector $\mathbf{a}$, we can compute the vector
$\mathbf{b} = \mathbf{A}^t\mathbf{a}$ by a single loop over
$i = 1, \ldots, M$ [15]: $b(N_i) := b(N_i) + a(i)$, simply
summing the temperature measurements of each pixel. Multiple
beams introduce virtually no additional difficulty. For
double-horned experiments like COBE and MAP, there are simply
two non-zero entries in each row, a "1" and a "-1".
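
The loop $b(N_i) := b(N_i) + a(i)$ maps directly onto an accumulate-by-index operation. A minimal sketch, with hypothetical function names and a made-up pointing list, using `numpy.bincount` in place of the explicit loop:

```python
import numpy as np

def apply_A(x, pix):
    """y = A x for a single-beam scan: read off the pixel seen by each sample."""
    return x[pix]

def apply_At(a, pix, n_pix):
    """b = A^t a: the loop b(N_i) := b(N_i) + a(i), vectorized with bincount."""
    return np.bincount(pix, weights=a, minlength=n_pix)

def apply_At_two_beam(a, pix_plus, pix_minus, n_pix):
    """b = A^t a when each row of A holds a +1 and a -1 (differencing beams)."""
    return (np.bincount(pix_plus, weights=a, minlength=n_pix)
            - np.bincount(pix_minus, weights=a, minlength=n_pix))

pix = np.array([0, 1, 2, 2, 1, 0, 3])   # hypothetical pointing list N_i
a = np.arange(1.0, 8.0)                 # one number per observation
b = apply_At(a, pix, n_pix=4)
```

Neither operation ever forms the $M \times N$ matrix $\mathbf{A}$, so the cost is linear in the number of observations.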



5. A trick for making the matrices band-diagonal 

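The computation of the matrices
$[\mathbf{A}^t\mathbf{M}\mathbf{A}]$ and
$[\mathbf{A}^t\mathbf{M}\mathbf{N}\mathbf{M}\mathbf{A}]$ can be
further accelerated by making the circulant matrices
$\mathbf{N}_c$ and $\mathbf{M}$ band-diagonal.

a. The noise model: The noise correlations $c_n$ from Eq. (31)
can be computed as

$$c_n = \frac{1}{2\pi}\int_{-\infty}^{\infty} P(\omega)\cos(n\omega\Delta t)\,d\omega, \qquad (41)$$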



where the noise (time) power spectrum $P(\omega)$ is simply the
Fourier transform of the time correlation function $c(\tau)$.
The noise characteristics of most CMB detectors can be well fit
by an expression of the form:

$$P(\omega) \propto \left[1 + \frac{\omega_k}{\omega} + \left(\frac{\omega_b}{\omega}\right)^2\right]|\widehat{W}(\omega)|^2. \qquad (42)$$

The three terms in square brackets correspond to white noise,
1/f noise and so-called brown noise, respectively, and the
"knee" frequencies $\omega_k$ and $\omega_b$ determine where
they yield the same power as the white noise. (The angular
frequency $\omega$ is related to the frequency $f$ by
$\omega = 2\pi f$.) Most CMB detectors have no brown noise
component ($\omega_b = 0$). We are including it here for
pedagogical reasons, since it turns out to be very simple to
understand its effects, and the properties of 1/f noise are
intermediate between the simple white and brown cases. $W$ is a
window function specifying what kind of analog smoothing
(convolution) was performed on the time signal before sampling
it. Here we will assume "boxcar" smoothing where $y_i$ is the
average of the signal measured during a time interval
$\Delta t$. This corresponds to
$W(\tau) = \theta(\Delta t/2 - |\tau|)/\Delta t$. Fourier
transforming this gives

$$\widehat{W}(\omega) = j_0(\omega\Delta t/2), \qquad (43)$$

where $j_0(x) \equiv \sin x / x$.

Substituting Eq. (41) into Eq. (40), we see that the relation
between the power spectrum and the spectral function is

$$\lambda(\varphi) = \frac{1}{\Delta t}\sum_{n=-\infty}^{\infty} P\left[\frac{\varphi + 2\pi n}{\Delta t}\right], \qquad (44)$$

where $P(-\omega) = P(\omega)$. In other words, the power
spectrum simply "wraps around" onto itself many times, with all
power above the Nyquist sampling frequency $\pi/\Delta t$
getting aliased down to lower frequencies.
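
The wrap-around of Eq. (44) can be illustrated for the simplest case, white noise seen through the boxcar window of Eq. (43). In the sketch below the sum over $n$ is truncated at a finite `n_wrap` and the overall normalization is left out; both are assumptions made for illustration, as are the function names.

```python
import numpy as np

def boxcar_white_power(omega, dt=1.0):
    """White noise through boxcar smoothing: |W(omega)|^2 = j0(omega dt / 2)^2."""
    # np.sinc(x) = sin(pi x) / (pi x), so j0(u) = np.sinc(u / pi).
    return np.sinc(omega * dt / (2.0 * np.pi)) ** 2

def wrapped_power(power, phi, dt=1.0, n_wrap=2000):
    """Sum P[(phi + 2 pi n) / dt] over n = -n_wrap..n_wrap (normalization omitted)."""
    n = np.arange(-n_wrap, n_wrap + 1)
    omega = (phi[:, None] + 2.0 * np.pi * n[None, :]) / dt
    return power(omega, dt).sum(axis=1)

phi = np.linspace(-np.pi, np.pi, 9)
lam = wrapped_power(boxcar_white_power, phi)
```

Up to truncation error the wrapped spectrum comes out flat in $\varphi$: the power that the boxcar window removes below the Nyquist frequency is exactly restored by aliasing, so the sampled white noise remains uncorrelated.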

b. The white noise case: White noise alone
($\omega_k = \omega_b = 0$) gives the trivial case of
uncorrelated noise: $c_n$ vanishes except for $n = 0$, so
$\mathbf{N} \propto \mathbf{I}$, $\mathbf{M} \propto \mathbf{I}$,
and the mapmaking reduces to simply averaging measurements of
each pixel in the map. The variance $\Sigma_{ii}$ in each map
pixel is simply $\sigma^2$ divided by the number of times it was
observed, so if the sky patch has been covered uniformly, we
obtain the familiar case $\boldsymbol{\Sigma} \propto \mathbf{I}$
corresponding to white noise in the map, whose angular power
spectrum is given by Eq. (1).
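
For the white noise case the whole mapmaking machinery collapses to per-pixel averaging. A minimal sketch under stated assumptions (a made-up 16-pixel map, uniform coverage, and $\sigma = 3$ in arbitrary units; none of these numbers come from the text):

```python
import numpy as np

def white_noise_map(y, pix, n_pix):
    """Average the samples falling in each pixel; returns (map, hit counts)."""
    hits = np.bincount(pix, minlength=n_pix)
    sums = np.bincount(pix, weights=y, minlength=n_pix)
    return sums / hits, hits

rng = np.random.default_rng(2)
n_pix, sigma = 16, 3.0
pix = np.tile(np.arange(n_pix), 500)             # uniform coverage: 500 hits per pixel
sky = rng.standard_normal(n_pix)                 # hypothetical true map x
y = sky[pix] + sigma * rng.standard_normal(pix.size)

x_hat, hits = white_noise_map(y, pix, n_pix)
expected_rms = sigma / np.sqrt(hits)             # sqrt(Sigma_ii) = sigma / sqrt(hits)
```

With uniform coverage every pixel has the same `expected_rms`, which is the statement $\boldsymbol{\Sigma} \propto \mathbf{I}$.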

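c. The correlated noise case: If 1/f noise or brown noise is
present, then the integral in Eq. (41) diverges at low
frequencies. This means that slow drifts will completely
dominate the noise, and that all the coefficients $c_n$ will be
equal (the noise at any two times will be perfectly correlated).
This is of course not a problem in practice, since we can remove
these slowly varying offsets. It is merely a numerical nuisance,
and is easily eliminated by replacing the TOD $\mathbf{y}$ by a
high-pass filtered data set

$$\tilde{\mathbf{y}} \equiv \mathbf{D}\mathbf{y}, \qquad (45)$$

where $\mathbf{D}$ is some appropriate circulant matrix. The new
noise covariance matrix becomes

$$\tilde{\mathbf{N}} \equiv \langle(\mathbf{D}\mathbf{n})(\mathbf{D}\mathbf{n})^t\rangle = \mathbf{D}\mathbf{N}\mathbf{D}^t. \qquad (46)$$

Using $\tilde{\mathbf{y}}$ instead of $\mathbf{y}$ as the
starting point for the mapmaking process, Eq. (23) becomes
$\tilde{\mathbf{y}} = \tilde{\mathbf{A}}\mathbf{x} + \tilde{\mathbf{n}}$,
where $\tilde{\mathbf{A}} \equiv \mathbf{D}\mathbf{A}$ and
$\tilde{\mathbf{n}} \equiv \mathbf{D}\mathbf{n}$, so equations
(25) and (28) follow with tildes on all matrices, or explicitly,
writing out $\tilde{\mathbf{A}} = \mathbf{D}\mathbf{A}$,

$$\boldsymbol{\Sigma} = [\mathbf{A}^t\mathbf{D}^t\mathbf{M}\mathbf{D}\mathbf{A}]^{-1}[\mathbf{A}^t\mathbf{D}^t\mathbf{M}\tilde{\mathbf{N}}\mathbf{M}\mathbf{D}\mathbf{A}][\mathbf{A}^t\mathbf{D}^t\mathbf{M}\mathbf{D}\mathbf{A}]^{-1},$$
$$\mathbf{W} = [\mathbf{A}^t\mathbf{D}^t\mathbf{M}\mathbf{D}\mathbf{A}]^{-1}\mathbf{A}^t\mathbf{D}^t\mathbf{M}\mathbf{D}, \qquad (47)$$

where $\tilde{\mathbf{N}} = \tilde{\mathbf{N}}_c + \tilde{\mathbf{N}}_s$
as before and $\mathbf{M} = \tilde{\mathbf{N}}_c^{-1}$.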

Fig. 7 shows the effect of the simple choice where all
components of $\mathbf{D}$ vanish except $D_{ii} = -1$ and
$D_{i,i+1} = 1$. This corresponds to simply taking differences
of consecutive observations: $\tilde{y}_i = y_{i+1} - y_i$, and
row zero of $\mathbf{D}$ (the convolution filter) is plotted in
the top panel. The bottom three panels show that whereas
$\mathbf{N}$ was pathological with non-zero and constant
correlations extending arbitrarily far from the diagonal,
$\tilde{\mathbf{N}}$ is almost diagonal. These correlation
functions were computed as follows. Eq. (40) shows that the
spectral function of $\mathbf{D}$ is
$\lambda(\varphi) = e^{i\varphi} - 1$, so that of the matrix
$\mathbf{D}\mathbf{D}^t$ is
$|e^{i\varphi} - 1|^2 = 4\sin^2(\varphi/2)$.
$\tilde{\mathbf{N}} = \mathbf{D}\mathbf{N}\mathbf{D}^t \approx \mathbf{D}\mathbf{N}_c\mathbf{D}^t = \mathbf{N}_c\mathbf{D}\mathbf{D}^t$
can therefore be computed explicitly by combining equations (39)
and (44), which gives



$$\tilde{c}_n \propto \int_{-\infty}^{\infty} |\varphi|^{\alpha}\,j_0^2(\varphi/2)\,\sin^2(\varphi/2)\,\cos(n\varphi)\,d\varphi, \qquad (48)$$

where $\alpha = 0$, $-1$ and $-2$ correspond to white, 1/f and
brown noise, respectively. Performing the integral for these
three cases gives

$$\tilde{c}_n \propto \begin{cases}
2\delta_{0n} - \delta_{1|n|} & \text{for white noise,} \\
\phi(n) & \text{for 1/f noise,} \\
\delta_{0n} & \text{for brown noise,}
\end{cases} \qquad (49)$$

where the function $\phi$ is given by

$$\phi(n) \equiv (n-2)^2\ln|n-2| - 4(n-1)^2\ln|n-1| + 6n^2\ln|n| - 4(n+1)^2\ln|n+1| + (n+2)^2\ln|n+2|, \qquad (50)$$



and $0\ln 0$ is to be interpreted as 0. For $n > 2$, this is
accurately approximated by

$$\phi(n) \approx -\frac{2}{n^2}, \qquad (51)$$

which shows that even the 1/f noise, which produces the widest
correlation function of the three types, is roughly
band-diagonal and can be safely truncated at say
$n > L \sim 100$.
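
The function of Eq. (50) is a fourth difference of $n^2\ln|n|$, so its decay can be checked directly. The sketch below evaluates it and compares it with $-2/n^2$, the leading large-$n$ behaviour obtained by treating the fourth difference as a fourth derivative; that comparison is an assumption of this sketch rather than a quoted result, and the function names are hypothetical.

```python
import numpy as np

def n2_log_n(n):
    """n^2 ln|n|, with 0 ln 0 interpreted as 0."""
    n = np.asarray(n, dtype=float)
    safe = np.where(n == 0, 1.0, np.abs(n))   # ln 1 = 0 handles the n = 0 case
    return n**2 * np.log(safe)

def phi(n):
    """Fourth difference of n^2 ln|n|, as in Eq. (50)."""
    n = np.asarray(n, dtype=float)
    return (n2_log_n(n - 2) - 4 * n2_log_n(n - 1) + 6 * n2_log_n(n)
            - 4 * n2_log_n(n + 1) + n2_log_n(n + 2))

lags = np.array([0, 1, 2, 3, 10, 50])
values = phi(lags)
asymptote = -2.0 / lags[3:].astype(float) ** 2   # leading behaviour at lags 3, 10, 50
```

The zero-lag value is positive, all other lags are negative, and the magnitude falls off quadratically, which is what makes truncation at a finite bandwidth $L$ harmless.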

As the figure shows, brown noise has the property that all the
differences are uncorrelated. Thus the noise $n(t)$ exhibits
Brownian motion over time, which explains its nickname. Brown
noise drifts like $t^{1/2}$ over time, whereas 1/f noise is much
milder in that it drifts only logarithmically in $t$.

When faced with a noise time stream $\mathbf{n}$ from real data,
a good way to diagnose it is to compute the differences
$\tilde{n}_i = n_{i+1} - n_i$ and estimate $\tilde{c}_k$ as the
time average of the product $\tilde{n}_i\tilde{n}_{i+k}$.
Fitting this with a linear combination of the three templates in
Fig. 7 will indicate the level at which the three basic types of
noise are present, although for a more accurate model, it is
better to compute the noise spectral function directly by
substituting the measured $\tilde{c}_n$-coefficients into
Eq. (40).
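
The diagnostic described above takes only a few lines. The sketch below applies it to simulated streams rather than real data; the stream length, the random seed and the use of a cumulative sum of white noise as a stand-in for brown noise are assumptions for illustration.

```python
import numpy as np

def differenced_correlation(n, max_lag):
    """Estimate c~_k as the time average of d_i d_{i+k}, with d_i = n_{i+1} - n_i."""
    d = np.diff(n)
    return np.array([np.mean(d[:len(d) - k] * d[k:]) for k in range(max_lag + 1)])

rng = np.random.default_rng(1)
white = rng.standard_normal(200_000)              # unit-variance white noise
brown = np.cumsum(rng.standard_normal(200_000))   # random walk ("brown" noise)

c_white = differenced_correlation(white, max_lag=3)   # close to (2, -1, 0, 0)
c_brown = differenced_correlation(brown, max_lag=3)   # close to (1, 0, 0, 0)
```

A least-squares fit of the measured coefficients to the white, 1/f and brown templates would then give the relative levels of the three components.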

d. Pre-whitening: To be able to make maps with Eq. (47), we want
all the circulant matrices that appear to be close to diagonal.
We saw that when $\mathbf{D}$ is simply the differencing matrix,
$\mathbf{D}$ and $\tilde{\mathbf{N}}_c$ are indeed
band-diagonal. But what about $\mathbf{M}$, the inverse of
$\tilde{\mathbf{N}}_c$? Fig. 8 shows the spectral function
$\lambda(\varphi)$ of $\mathbf{N}_c$ and $\tilde{\mathbf{N}}_c$
when all three types of noise are present. For this case,
$\lambda(\varphi) > 0$ for all $\varphi$, so its inverse, which
is the spectral function of $\mathbf{M}$, will be smooth and
well-behaved, giving a band-diagonal $\mathbf{M}$ as desired. If
there is no brown noise component, however, we will have
$\lambda(\varphi) \propto |\varphi|$ for $|\varphi| \ll 1$
(lower panel), so the spectral function of $\mathbf{M}$ blows up
near the origin and $\mathbf{M}$ will have inconvenient non-zero
elements arbitrarily far from the diagonal. (Differencing
multiplies by $f^2$ near the origin, so this "overkill" of 1/f
noise produces an $\mathbf{M}$-matrix with 1/f noise.) This can
be remedied by a better choice of $\mathbf{D}$, whose spectral
function exactly neutralizes the 1/f-noise at the origin. A
simple choice that does this is the $\mathbf{D}$ whose spectral
function is $|\sin(\varphi/2)|^{1/2}$, and is a
"half-difference" in the sense that doing it four times is
equivalent to double differencing, which we saw multiplied the
spectral function by $\sin^2(\varphi/2)$. The explicit
convolution filter is plotted in Fig. 9, and is seen to keep
both white and 1/f noise close to diagonal. Another attractive
option [15] is to prewhiten the data, by choosing the high-pass
filter $\mathbf{D}$ to have a spectral function that is the
inverse square root of the spectral function of $\mathbf{N}_c$.
This reduces $\tilde{\mathbf{N}}_c$ (and hence also
$\mathbf{M}$) to the identity matrix, so
$\mathbf{D}^t\mathbf{D}$ is the only circulant matrix remaining
in Eq. (47). We remind the reader that all choices of
$\mathbf{D}$ produce the exact same answer, so the choice is
merely one of numerical convenience. The choice of $\mathbf{D}$
makes very little difference in practice as well, as long as one
ensures that the resulting spectral function of
$\tilde{\mathbf{N}}_c$ is smooth and non-zero, since multiplying
the various circulant matrices together in Eq. (47) is virtually
instantaneous compared to the other numerical steps.



B. Case study 3: four scan patterns 

Let us consider a square sky patch of diameter 8°, divided into
$N = 32 \times 32$ pixels, scanned in four different ways as
illustrated in Fig. 10:

1. Serpentine scan: the beam sweeps back and 
forth horizontally, gradually shifting downward, 
not crossing its path until the entire patch has been 
covered. This is reversed, then everything is re- 
peated. 

2. Grating scan: a serpentine scan is augmented 
with an equal amount of time spent scanning up 
and down along the left and right edges. 

3. Fence scan: two sets of serpentine scans are per- 
formed in succession, one horizontal and one verti- 
cal (rotated by 90°). 

4. Random scan: The beam jumps to a random 
pixel after each observation, but in such a way that 
all pixel pairs are connected equally many times. 
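
Pointing lists for toy versions of these patterns can be generated as follows. This is a simplified sketch with hypothetical function names: the grating scan is omitted, and the random scan shown here draws pixels uniformly without enforcing the equal-pair-count condition stated in item 4.

```python
import numpy as np

def serpentine_pass(n_side):
    """One back-and-forth pass over an n_side x n_side grid, row by row."""
    grid = np.arange(n_side * n_side).reshape(n_side, n_side)
    grid[1::2] = grid[1::2, ::-1].copy()        # reverse every other row
    return grid.ravel()

def serpentine_scan(n_side):
    """A pass followed by the same path traversed in reverse."""
    forward = serpentine_pass(n_side)
    return np.concatenate([forward, forward[::-1]])

def fence_scan(n_side):
    """A horizontal serpentine pass followed by a vertical (transposed) one."""
    horizontal = serpentine_pass(n_side)
    row, col = np.divmod(horizontal, n_side)
    vertical = col * n_side + row               # swap the roles of rows and columns
    return np.concatenate([horizontal, vertical])

def random_scan(n_side, n_obs, seed=0):
    """Uniformly random pointing (does not enforce equal pair counts)."""
    return np.random.default_rng(seed).integers(0, n_side * n_side, size=n_obs)

fence = fence_scan(32)                          # pixel index N_i for each sample
```

Repeating any of these lists until $M$ samples have been produced gives the pixel sequence $N_i$ that defines the matrix $\mathbf{A}$.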

In all cases, we make $M = 2^{21} \approx 2 \times 10^6$
observations. These simple scanning strategies span the entire
range of "connectedness" available in real-world experiments,
with the serpentine scan being the least connected one possible
and the random scan at the other extreme. An experiment with
disjoint strips such as Tenerife is more similar to the
serpentine case, whereas double-beam differencing experiments
such as COBE are very well connected and more similar to the
random case. A Planck scan strategy pattern with great circles
(pointing 90° away from the spin axis) would be reminiscent of
the grating case, with disjoint strips (in this case circular
arcs) connected together at two points (at the poles, where the
circles cross). Several recently flown and proposed balloon
experiments have linear or circular scans intersecting at a
variety of angles, which makes them similar to the fence case.
If Planck points 70° away from the spin axis, as originally
proposed, its scan pattern would also be rather fence-like.

To be able to isolate how the features of these scan patterns
affect the ability to minimize various types of noise, let us
first study the effect of white and 1/f noise separately, then
compute some cases where they are present in combination.



C. Measuring the noise power spectrum 

We first note that pixel noise strictly speaking does not 
have a power spectrum at all in general, since its statis- 
tical properties are not isotropic. Rather, the quantity 
of interest is how much power it adds to our estimates of 
the CMB power spectrum coefficients (the expectation 
value of this noise contribution is of course subtracted 
out to make the $C_\ell$-estimates unbiased, but the noise
still contributes to the error bars on these estimates). We 
will therefore compute the noise power just as we would 
compute the CMB power, using the minimum-variance 
method p"^ ]. Using a simple white noise prior, this cor- 
responds to computing 



$$C_\ell^{\rm noise} = \frac{\mathrm{tr}\,[\mathbf{P}^\ell\boldsymbol{\Sigma}]}{\mathrm{tr}\,[\mathbf{P}^\ell\mathbf{C}]}, \qquad (52)$$

where the matrix $\mathbf{P}^\ell$ is given by

$$P^\ell_{ij} \equiv P_\ell(\hat{\mathbf{r}}_i\cdot\hat{\mathbf{r}}_j), \qquad (53)$$

and the $P_\ell$ are Legendre polynomials. $\mathbf{C}$ is the
covariance matrix that would result from a white noise power
spectrum; $\mathbf{C} = \sum_\ell (2\ell + 1)\mathbf{P}^\ell/4\pi$.



D. Pure white noise

Our four scan patterns were chosen such that all pixels are
observed the same number of times (except for the pixels on the
left and right edges of the grating scan). This means that if
only white noise is present, we will have uniform uncorrelated
map noise ($\boldsymbol{\Sigma} \propto \mathbf{I}$), and the
simple expression in Eq. (1) applies. This is the lowermost line
plotted in Fig. 12. To draw attention to the simple shapes of
the noise power spectra, we are not including the beam smearing
effect here ($B_\ell = 1$), which would otherwise make
$C_\ell^{\rm noise}$ blow up exponentially for large $\ell$.
Note that the curve is only horizontal (as predicted by Eq. (1))
on the scales probed by the experiment. On scales comparable to
the pixel separation, artifacts appear (this is irrelevant when
the map is properly oversampled, since beam smoothing destroys
any CMB signal on these scales). On scales larger than the patch
size (corresponding to $\ell < 30$), the noise power drops as
$\ell^2$ since the mapmaking algorithm is insensitive to the
monopole (mean) of the map - this occurs automatically when 1/f
noise is present, as the method removes baseline drifts. (The
matrix $[\mathbf{A}^t\mathbf{D}^t\mathbf{M}\mathbf{D}\mathbf{A}]$
to be inverted in Eq. (47) will have one vanishing eigenvalue,
corresponding to the mean, which is dealt with using the
pseudo-inverse approach described in the appendix of [17].)



E. Pure 1/f noise

The other curves in Fig. 12 show the effect of pure 1/f-
noise. Note that there is no scale in the problem other 
than the patch size (to the left of which the monopole 
removal starts suppressing the power) and the pixel sep- 
aration scale (where irrelevant artifacts appear and we 
have truncated the curves), so it should come as no sur- 
prise that the curves are rather featureless between these 
two scales. (The 1/f knee frequency cannot imprint a
feature here, since it is of course only defined when there 
is white noise present.) The normalization is arbitrary 
- doubling the receiver noise merely doubles the power 
spectrum. 

The random scan pattern is seen to produce a beau- 
tiful white noise power spectrum, indistinguishable in 
shape from the above-mentioned white noise spectrum 
plotted beneath it. This is quantitative verification of 
the claim that a well-connected "messy" scan differ- 
encing widely separate parts of the sky produces a map 
with virtually uncorrelated pixel noise. This is also seen in
Fig. 11, which shows how correlated different map pixels are
with the one in the center. Numerical inspection of the
covariance matrix $\boldsymbol{\Sigma}$ shows that it is to a
good approximation proportional to the identity matrix, with the
mean of all rows and columns subtracted off due to the monopole
removal. Both the fence and grating scans have roughly

$$C_\ell^{\rm noise} \propto \frac{1}{\ell} \qquad (54)$$

over the range of scales probed by the experiment. In other
words, their angular power spectrum obeys the same power law in
$\ell$ as their time power spectrum does in $f$. Thus although
the r.m.s. noise per pixel (which is dominated by the
contribution from $\ell$ around the pixel separation scale,
where the three power spectra are comparable in magnitude) is
quite similar for the grating, fence and random scans, the first
two give substantially more power than the third on larger
scales. This is because there are no "short cuts" from one part
of the map to the other, so that large-scale drifts inevitably
leak from the time stream into the spatial noise distribution.
Another way of interpreting this excess large-scale noise is
that although the pixel r.m.s. may be small, neighboring pixels
are correlated so that the effective number of independent
pixels is reduced.

Small-scale connectedness is helpful as well, as the figure
shows. The four power spectra rank in the same order as their
degree of connectedness on all scales (the only exception being
the grating scan, where the small-scale noise is raised since
half of the time was spent on the side bars). The serpentine
scan is a particularly poor performer, with a full order of
magnitude more noise power than the fence scan on most scales.
The source of the problem with the serpentine scan is
illustrated in Fig. 11. The fence scan would produce correlation
stripes shaped like a + symbol if the map were made by simply
averaging the observations of each pixel, as is optimal for
white noise. Because of the high degree of interconnectedness,
however, the data analysis method is able to eliminate this
striping, and the correlation region is seen to be fairly round.
For the serpentine scan, however, the correlation stripe along
the scan path persists. This is because, as is easy to show, the
matrix $\mathbf{W}$ of Eq. (25) becomes the same as for the
white noise case in the absence of interconnections, apart from
removing an overall drift over the entire serpentine. In other
words, the sophisticated mapmaking method is powerless against
1/f noise when the scan pattern is poorly connected.

The situation is seen to be rather intermediate for the 
grating scan: the correlation is strong along the scan path 
until it reaches the side bars, where it gets connected with 
all the other rows. 



F. White and 1/f noise combined 

Fig. 13 shows the noise power spectra resulting from a
combination of white and 1/f noise, where the knee frequency is
a tenth of the sampling rate. Once again, the better connected
scan strategies are seen to produce less noise power, although
the fence scan is actually very marginally better at the
smallest scales. The random scan is seen to produce white noise
as usual, whereas the logarithmic slope of $C_\ell^{\rm noise}$
for the other scan strategies is intermediate between the 1/f
and white cases of $-1$ and 0. (We have omitted the grating case
to avoid
over-crowding this plot.) The noise power from maps 
containing the white and 1/f components alone are also 
plotted here for comparison, and it should be noted that 
the total power when both are present is always slightly 
greater than the sum of these curves (even though all 
noise components of course add when W is held fixed), 
since we cannot optimize for two different types of noise 
at the same time. 



G. Lesson 3: how to choose the scan strategy 

Connectedness is clearly desirable since it reduces the
contribution from 1/f noise to $C_\ell^{\rm noise}$. It is also
useful for reducing the susceptibility to systematic errors
[25], and makes the maps easier to analyze by making
noise correlations more isotropic. However, complicated 
interconnected scan patterns can also create problems. 
They might complicate the experimental design, perhaps 
requiring additional moving parts which can cause sys- 
tematic problems. For ground and balloon based exper- 
iments, a strategy requiring scans with non-constant el- 
evation can introduce systematic modulations, since the 
amount of atmosphere that the beam must penetrate will 
vary with time. The relevant question is therefore how
much effort it is worth expending to increase the
connectivity of the scan pattern.



Eq. (14) shows that to accurately constrain cosmological
models, we want to minimize the variance $(\Delta C_\ell)^2$
for each multipole. As we discussed in Section II D, it
is best to choose the map size so that noise and sample
variance contribute roughly equally to $\Delta C_\ell$ on the pixel
scale, i.e., at the right edge of the curves in Fig. 13. The
sample variance scales with $\ell$ like the CMB power spectrum,
which is included in Fig. 13 for comparison. It lacks
the familiar rise at the Doppler peaks simply because we
have plotted $C_\ell$ rather than the more familiar quantity
$\ell(\ell+1)C_\ell$. Since we found that $C_\ell^{\rm noise}$ never falls off faster
than $\ell^{-1}$ (this was for pure 1/f noise, and a white noise
component further reduces the slope) whereas roughly $C_\ell \propto \ell^{-2}$,
$\Delta C_\ell$ will be almost completely dominated by sample variance
for all but the largest $\ell$-values probed. This means
that the huge visual differences between the noise power
spectra are in fact relatively unimportant when it comes
to measuring cosmological parameters, with the only really
important quantity being the power on the pixel
scale, which is roughly proportional to the pixel variance
$\sigma^2$. $\sigma$ turns out to be only 16% smaller for the random
scan than for the fence scan for pure 1/f noise, and when
we included white noise, the fence scan was actually the
marginally better one (by 6%). Our only clearly undesirable
scan is the serpentine option, which adds noise
power even on the smallest scales and whose r.m.s. pixel
noise is almost a factor of two worse than the fence and
random scans (this ratio of course depends strongly on
the knee frequency $f_{\rm knee}$).
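
The scaling argument above can be illustrated with a short numerical sketch. It assumes that the error bar is proportional to the sum of signal and noise power, that both are pure power laws (slope -2 for the signal, -1 for the noise), and that they are equal on the pixel scale. The function name and the value `ell_pix = 1000` are hypothetical and chosen only for illustration.

```python
import numpy as np

def noise_to_signal(ell, ell_pix, noise_slope=-1.0, signal_slope=-2.0):
    """Ratio C_noise / C_signal for two power laws that are equal at ell_pix."""
    ell = np.asarray(ell, dtype=float)
    return (ell / ell_pix) ** (noise_slope - signal_slope)

# Hypothetical pixel-scale multipole where noise and signal power are matched.
ell_pix = 1000.0
ells = np.array([10.0, 100.0, 500.0, 1000.0])

ratio = noise_to_signal(ells, ell_pix)
# Fraction of (C_signal + C_noise) contributed by noise at each multipole.
noise_fraction = ratio / (1.0 + ratio)

for ell, frac in zip(ells, noise_fraction):
    print(f"ell = {ell:6.0f}: noise fraction of the error bar = {frac:.3f}")
```

Under these assumptions the noise share grows linearly with $\ell$ at low multipoles and reaches one half only at the pixel scale, which is why the pixel-scale noise power is the quantity that matters most.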

In conclusion, it is desirable to invest a moderate but 
not extreme effort into making the scan pattern more 
connected than the technically most convenient option. 
For a ground- or balloon-based experiment, a serpentine- 
like scan can be readily made more fence-like by moving 
the sky patch to be mapped further away from the equa- 
tor, so that repeated scans at constant elevation will cross 
due to Earth's rotation. Likewise, a grating-like Planck 
great circle scan pattern can be made more fence-like by 
reducing the angle between the beam and spin axis, and 
still more by occasionally tilting the spin axis out of the 
ecliptic plane. On the other hand, going beyond fence
connectivity, where one already has nice isotropic pixel
noise with good systematics cross-checks, probably does
not warrant the effort unless it can be done in a technically
elegant way, such as for MAP, that does not introduce
new potential systematic problems.



IV. CONCLUSIONS 

We have discussed the various tradeoffs faced when 
designing a CMB mapping experiment from the point of 
view of maximizing the scientific "bang for the buck".

Although the traditional approach to this problem 
has been numerically expensive Monte Carlo simulations, 
we have taken a no-simulation approach. We found 
that although state-of-the-art data analysis techniques
such as signal-to-noise eigenmode analysis and minimum-
variance power spectrum estimation are normally treated
as black-box methods, their results can often be accu- 
rately approximated by simple analytic expressions. This 
allows an intuitive understanding of how changing the 
various experimental parameters affects the ability to 
constrain cosmological models. Illustrating these causal 
relationships with simple case studies, we arrived at the 
following rules of thumb. 

• Size: For a given resolution and sensitivity, it is 
best to cover a sky area such that the signal-to- 
noise ratio per resolution element (pixel) is of order 
unity. 

• Shape: It is best to avoid excessively "skinny" ob- 
serving regions, narrower than a few degrees. 

• 1/f noise: Scan strategies of both the fence type
and the random type allow the map-making algorithm
to substantially reduce the effect of 1/f
noise, which makes the noise correlations more
isotropic and produces a noise power spectrum of
slope between $\ell^0$ and $\ell^{-1}$. Since this is much flatter
than the true CMB spectrum is expected to be,
slight large-scale noise modulations are cosmologically
unimportant when the map size is chosen as
suggested above, being dwarfed by sample variance.

APPENDIX A: THE NOISE POWER SPECTRUM
WITH INCOMPLETE SKY COVERAGE 

In this Appendix, we derive Eq. (1). Knox first treated
the case $f_{\rm sky} = 1$, and early incorrect generalizations
to the general case were corrected by Magueijo & Hobson.
Since no detailed derivation has yet been published
for this case, we provide one here for completeness.

When the pixel noise is uniform and uncorrelated, the
quantities $n_i$ (the noise in the $i$th pixel) are random
variables satisfying

$$\langle n_i n_j \rangle = \delta_{ij}\,\sigma^2. \qquad (55)$$



Ignoring beam smoothing for the moment, we want to
show that the effect of this discrete pixel noise is the same
as if there were a continuous random field $n(\hat{\mathbf r})$ on the sky
with power spectrum $C_\ell^{\rm noise} = C \equiv \Omega\sigma^2/N$. This is a
white noise power spectrum, since it is independent of $\ell$,
which corresponds to a Dirac delta correlation function

$$\langle n(\hat{\mathbf r})\, n(\hat{\mathbf r}') \rangle = C\,\delta(\hat{\mathbf r}, \hat{\mathbf r}'). \qquad (56)$$



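As long as the pixelization is uniform and fine enough,
we can approximate the integral of any function $f$ over
our patch (of solid angle $\Omega$) by a sum:

$$\int f(\hat{\mathbf r})\, d\Omega \approx \frac{\Omega}{N} \sum_{i=1}^{N} f(\hat{\mathbf r}_i). \qquad (57)$$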
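
As a rough numerical illustration of the sum approximation in Eq. (57), the sketch below integrates a smooth test function over a small square patch in the flat-sky limit, once with a coarse and once with a fine pixelization. The patch geometry, the test function and the helper name `patch_sum` are assumptions made purely for illustration.

```python
import numpy as np

def patch_sum(f, side, n_side):
    """Approximate the integral of f over a flat square patch by (Omega/N) * sum_i f(r_i)."""
    omega = side ** 2          # solid angle of the patch (flat-sky approximation)
    n_pix = n_side ** 2        # number of pixels N
    # Pixel centres on a uniform grid covering the patch.
    centres = (np.arange(n_side) + 0.5) * side / n_side
    x, y = np.meshgrid(centres, centres)
    return omega / n_pix * np.sum(f(x, y))

side = 0.2                                   # patch side in radians (hypothetical)
f = lambda x, y: np.cos(10.0 * x) * (1.0 + y)

# Analytic integral of f over [0, side] x [0, side].
exact = (np.sin(10.0 * side) / 10.0) * (side + side ** 2 / 2.0)

coarse = patch_sum(f, side, 4)               # 16 pixels
fine = patch_sum(f, side, 64)                # 4096 pixels

print(f"exact  = {exact:.6e}")
print(f"coarse = {coarse:.6e}")
print(f"fine   = {fine:.6e}")
```

The finer pixelization tracks the analytic integral more closely, which is the sense in which the pixelization must be "fine enough" compared with the scale on which the integrand varies.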



When performing a statistical analysis of a CMB map
(for instance a signal-to-noise eigenmode analysis), we
always expand it in some functions, say $\psi, \psi', \ldots$, so the
noise in these expansion coefficients, say $a, a'$, is given
by

$$a = \frac{\Omega}{N} \sum_{i=1}^{N} \psi(\hat{\mathbf r}_i)\, n_i. \qquad (58)$$



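For Gaussian noise, the statistical properties of these
coefficients are completely specified by their covariance,
which using Eq. (55) is given by

$$\langle a a' \rangle = \left(\frac{\Omega}{N}\right)^2 \sum_{i=1}^{N} \sum_{j=1}^{N} \psi(\hat{\mathbf r}_i)\, \psi'(\hat{\mathbf r}_j)\, \langle n_i n_j \rangle = \left(\frac{\Omega}{N}\right)^2 \sigma^2 \sum_{i=1}^{N} \psi(\hat{\mathbf r}_i)\, \psi'(\hat{\mathbf r}_i). \qquad (59)$$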



When computing this covariance as if the noise were a
continuous random field, Eq. (57) gives

$$a \approx \int \psi(\hat{\mathbf r})\, n(\hat{\mathbf r})\, d\Omega, \qquad (60)$$



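and using Eq. (56), we obtain

$$\langle a a' \rangle \approx \int\!\!\int \psi(\hat{\mathbf r})\, \psi'(\hat{\mathbf r}')\, \langle n(\hat{\mathbf r})\, n(\hat{\mathbf r}') \rangle\, d\Omega\, d\Omega' = C \int \psi(\hat{\mathbf r})\, \psi'(\hat{\mathbf r})\, d\Omega \qquad (61)$$

$$\approx \frac{\Omega C}{N} \sum_{i=1}^{N} \psi(\hat{\mathbf r}_i)\, \psi'(\hat{\mathbf r}_i). \qquad (62)$$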



Comparing equations (59) and (61), we see that the two
ways of treating the noise give the same answer if $C = \Omega\sigma^2/N$.
All that remains to prove Eq. (1) is to divide
the right hand side by the beam correction $B_\ell^2$, noting
that the noise is added to the sky signal after it has been
smoothed by the experimental beam.
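
The equivalence between the discrete and continuous treatments can be checked numerically. The following is a minimal sketch under stated assumptions: a flat square patch, two hypothetical expansion functions `psi` and `psi_p`, and arbitrary values for the patch size, pixel count and pixel noise. It compares the discrete covariance of Eq. (59), the continuous expression $C \int \psi \psi' d\Omega$ with $C = \Omega\sigma^2/N$, and a Monte Carlo estimate from simulated white pixel noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical patch: flat square of side 0.2 rad with 16 x 16 pixels.
side, n_side, sigma = 0.2, 16, 3.0
omega, n_pix = side ** 2, n_side ** 2
centres = (np.arange(n_side) + 0.5) * side / n_side
x, y = np.meshgrid(centres, centres)

# Two illustrative expansion functions evaluated at the pixel centres.
psi = np.cos(2 * np.pi * x / side).ravel()
psi_p = (np.cos(2 * np.pi * x / side) + np.sin(2 * np.pi * y / side)).ravel()

# Discrete treatment: (Omega/N)^2 sigma^2 sum_i psi_i psi'_i.
cov_discrete = (omega / n_pix) ** 2 * sigma ** 2 * np.sum(psi * psi_p)

# Continuous treatment: C * integral of psi * psi' over the patch, with C = Omega sigma^2 / N.
# For these functions the integral is side^2 / 2 (the cos*sin cross term integrates to zero).
C = omega * sigma ** 2 / n_pix
cov_continuous = C * side ** 2 / 2.0

# Monte Carlo: draw uncorrelated pixel noise and form the coefficients a and a'.
n_real = 20000
noise = rng.normal(0.0, sigma, size=(n_real, n_pix))
a = omega / n_pix * noise @ psi
a_p = omega / n_pix * noise @ psi_p
cov_mc = np.mean(a * a_p)

print(f"discrete   : {cov_discrete:.4e}")
print(f"continuous : {cov_continuous:.4e}")
print(f"Monte Carlo: {cov_mc:.4e}")
```

The first two numbers agree because the test functions are smooth on the pixel scale, and the Monte Carlo estimate should scatter around them at the level set by the number of realizations.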

FIG. 1. The signal-to-noise eigenmodes are plotted for a circular sky patch of 10° diameter using a flat fiducial power
spectrum. The modes plotted are, from left to right, top to bottom, 1, 2, 3, 4, 6, 10, 30, 50, 100, 150, 300 and 500. 

FIG. 3. Understanding signal-to-noise eigenmodes: exact calculations and approximations. The signal-to-noise ratio
$\lambda^{1/2}$ is plotted (dots) against mode number for three exact numerical calculations together with the approximation
given in the text (dashed lines). From top to bottom, the three cases correspond to complete sky coverage with
$w^{-1} = 7 \times 10^{-15}$, a disk of diameter 10° with $w^{-1} = 9 \times 10^{-16}$, and a 5° disk with $w^{-1} = 2 \times 10^{-15}$.

FIG. 4. Eigenmodes in real space and Fourier space. The window functions are plotted for the signal-to-noise eigenmodes
centered near $\ell \sim 220$ for three rectangular sky patches of the same area (64 square degrees), and are seen to be wider for the skinnier
patches. The spatial eigenmodes themselves are also shown (inset).

FIG. 5. The signal-to-noise eigenmodes 53, 52 and 51 (from top to bottom) and the corresponding window functions are
plotted for a 2° x 32° rectangular sky patch. 

FIG. 6. Understanding window functions. The two-dimensional Fourier transforms of the three eigenmodes from the
previous figure are schematically illustrated by the ellipses at A, B and C. The width of the one-dimensional window functions
corresponds to their radial extent, i.e., to how many of the circles they cross, so C gives a much narrower window than A in
Fig. 5. The situation for the eigenmodes in Fig. 4 is illustrated by the ellipses to the lower left.

FIG. 7. When each measurement is subtracted from the one following it (using the differencing filter in the top panel), the
correlation functions resulting from white, 1/f and brown noise take the form shown in the three lower panels.

FIG. 8. When the time stream is convolved with the "half differencing" filter in the top panel, the correlation functions
resulting from white and 1/f noise are as shown in the lower panels. Unlike in the previous figure, the 1/f correlation
function does not sum to zero, which makes M band-diagonal.

FIG. 9. Spectral functions. The top panel shows the spectral function for a sample noise covariance matrix (solid line) and
its decomposition into white, 1/f and brown noise (dashed curves). The bottom panel shows the same spectral function after
differencing the data (dashed curve), corresponding to multiplication with $\sin^2(\varphi/2)$. In the absence of brown noise (lower solid
curve), $\lambda(0) = 0$, which is inconvenient for computing $N^{-1}$, but this problem can be eliminated by using a different high-pass
filter: the upper solid curve differs by a factor $|\sin(\varphi/2)|$.

FIG. 10. Schematic illustration of the four scan patterns described in the text.

FIG. 11. The correlation between the various pixels and the one in the center is plotted for the serpentine (upper left),
grating (upper right), fence (lower left) and random (lower right) scan patterns for the case of pure 1/f noise.

FIG. 12. The noise power spectrum $C_\ell^{\rm noise}$ is plotted for our four scan patterns given pure 1/f noise. The corresponding
noise power for white detector noise (which is identical for the serpentine, fence and random scans) is plotted below for
comparison, as well as a standard CDM power spectrum (top). The straight line has slope $\ell^{-1}$, just like the serpentine and
fence power spectra.

FIG. 13. The noise power spectrum $C_\ell^{\rm noise}$ is plotted for the serpentine, fence and random scans with a combination of
white and 1/f noise (solid curves), only the 1/f component (dashed curves) and only the white component (dotted curve,
identical for all three scan patterns). A standard CDM power spectrum is plotted for comparison (top).